WO2017002130A1 - Transformation of marked-up content to a reversible file format for automated browser based pagination

Info

Publication number
WO2017002130A1
Authority
WO
WIPO (PCT)
Prior art keywords
span
file format
content
page
marked
Prior art date
Application number
PCT/IN2016/000159
Other languages
French (fr)
Inventor
Venkatesan Sumangali KIDAMBI
Srikanth VITTAL
Bhaskar Mannargudi VENKATRAMAN
Original Assignee
Tnq Books And Journals Private Limited
Priority date
Filing date
Publication date
Application filed by Tnq Books And Journals Private Limited
Priority to EP16817386.2A (EP3317780A4)
Priority to US15/551,292 (US10157238B2)
Publication of WO2017002130A1
Priority to US15/695,017 (US10318614B2)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/80 Information retrieval; Database structures therefor; File system structures therefor of semi-structured data, e.g. markup language structured data such as SGML, XML or HTML
    • G06F16/84 Mapping; Conversion
    • G06F16/88 Mark-up to mark-up conversion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/957 Browsing optimisation, e.g. caching or content distillation
    • G06F16/9577 Optimising the visualization of content, e.g. distillation of HTML documents
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/103 Formatting, i.e. changing of presentation of documents
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/103 Formatting, i.e. changing of presentation of documents
    • G06F40/114 Pagination
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/103 Formatting, i.e. changing of presentation of documents
    • G06F40/117 Tagging; Marking up; Designating a block; Setting of attributes

Definitions

  • a typical markup language document is made of different types of content, for example, textual content, images, videos, etc., and carries syntax information that instructs a browser how to render different types of content in the markup language document to a user.
  • the syntax information comprises a set of markup language tags that are executed on the browser.
  • rendering a document on a browser can be controlled, for example, by using cascading style sheets (CSS) that describe the formatting of a document written in a markup language.
  • CSS cascading style sheets
  • a CSS document is typically attached, embedded, or linked to a markup language document. The CSS defines how each element, for example, font size of text, color of a background or text, position and alignment of content elements, etc., in the markup language document appears on the browser.
  • markup language documents are typically displayed as continuous running documents without any page breaks. These continuous running documents are not print-friendly.
  • a typical markup language document can accommodate a large amount of content, whereas a standard print ready page has, for example, 8.5" x 11" dimensions with margins that reduce the space available for accommodation of a large amount of content during a print operation.
  • the content has to be broken at two levels, that is, a horizontal level or page width and a vertical level or page height.
  • the page width relates to a line break, and the page height relates to a page break.
  • Content rendering on a browser can have loose lines, and spaces are often distributed in ways that make a page appear to have rivers of blanks flowing through the page.
  • Line breaks rendered by the browser can be discerned as belonging to four different types, namely, word space breaks (wsbr), soft hyphen breaks (wshbr), hard breaks (wbr), and para breaks (wsp).
  • Word space breaks are discerned by finding which spaces are quashed to a zero width. The word space breaks are then interpreted as the end of a line or a line break.
  • for soft hyphens, if a line breaks at a soft hyphen, then the soft hyphen attains a non-zero width, which is also interpreted as the end of the line or as a line break.
  • a hard line break can be discerned when an offset decrease is encountered. Therefore, any markup language content that falls outside a printing area needs to be resized and repositioned accordingly for an optimal print output without losing any data when a print operation is performed.
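The three measurement rules above (a word space quashed to zero width, a soft hyphen that attains a non-zero width, and a decrease in offset) can be illustrated with a minimal sketch. It assumes the rendered width and left offset of each character have already been measured in the browser; the `Glyph` record and the `classify_breaks` function are hypothetical names introduced only for this illustration and are not part of the disclosure. Para breaks (wsp) derive from block elements rather than from measurements, so they are not classified here.

```python
from dataclasses import dataclass
from typing import List, Optional

SOFT_HYPHEN = "\u00ad"  # Unicode U+00AD SOFT HYPHEN


@dataclass
class Glyph:
    """One measured character (hypothetical record): text, rendered width, left offset in px."""
    char: str
    width: float
    left: float


def classify_breaks(glyphs: List[Glyph]) -> List[Optional[str]]:
    """Return one label per glyph: 'wsbr', 'wshbr', 'wbr', or None where no line ends."""
    labels: List[Optional[str]] = []
    for i, g in enumerate(glyphs):
        label = None
        if g.char == " " and g.width == 0:
            label = "wsbr"   # word space quashed to zero width
        elif g.char == SOFT_HYPHEN and g.width > 0:
            label = "wshbr"  # soft hyphen became visible at the line end
        elif i + 1 < len(glyphs) and glyphs[i + 1].left < g.left:
            label = "wbr"    # offset decreased without a space or soft hyphen
        labels.append(label)
    return labels
```

The order of the checks matters in this sketch: a quashed space or a visible soft hyphen is also followed by an offset decrease, so those two cases are tested before the generic offset rule.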
  • One method for printing continuous running pages involves introducing page breaks based on a vertical height equal to a page of printing media upon which the content is to be printed. The problem with relying on introducing page breaks based on the vertical height is that text lines and other content are disrupted in between a page and the same is printed. There are additional problems, for example, numbering the pages as page numbers are forced and not based on the content, page layout issues on print media and on handheld devices, etc. Floats such as images and tables can split and spill across pages and trying to avoid these can result in large vertical gaps, making the presentation undesirable.
  • markup language documents for example, hypertext markup language (HTML) documents
  • word spaces and line breaks are not explicitly tagged.
  • the word spaces and the line breaks remain anonymous, for example, as generic word spaces and line breaks, and hence are difficult to read and understand for printing accurately.
  • handheld devices for example, smartphones, tablets, etc.
  • the non-print-friendly documents, page numbering issues, and other page layout problems still exist in fluid pages.
  • HTML documents are typically interactive and dynamic in nature, whereas the print is essentially static in nature.
  • hypertext markup language (HTML) documents contain free flowing or reflowing content. Images, paragraphs, videos and other similar content are arranged in an HTML document as tags. HTML documents are adaptable to different devices. That is, if an HTML document is viewed in a web browser, then the HTML document adapts to the web browser and displays content of the HTML document as per the specifications of the web browser. If this HTML document is viewed on a mobile browser of a mobile device, then the HTML document adapts to the specifications of the mobile browser.
  • the HTML content is not suitable to print. Since the HTML content is not fixed, a printer would interpret specific elements of the HTML content inaccurately and therefore print the HTML content inaccurately. While there are many transformation techniques and file formats, these file formats are not reversible and do not restore fluidity of the transformed markup language documents. One of the main reasons that the fluidity cannot be restored is that the page output in non-reversible file formats are defined graphically as a set of printer instructions at a glyph level that lose structural information at a character level and a content level.
  • Markup language content and associated content elements are interpreted and defined using markup language tags on any standard web browser.
  • the tags included in a markup language document are typically executed on a server or on a web browser. Scripts or tags that run directly on a web browser have less latency time compared to a server side execution of tags.
  • a server side execution of tags requires an active network connection, whereas a client side execution of web browser compatible tags runs without an active network connection.
  • Most textual markup language documents are rendered in a client-server architecture, where there are delays and additional communication cost between a server and a user's client device for presenting and printing markup language documents.
  • Pagination of a hypertext markup language (HTML) document involves partitioning content of the HTML document and presenting the partitioned content on individual pages.
  • Conventional solutions include pagination of HTML documents based either on cut-off markers or the number of items to be displayed per page. These solutions are typically implemented using server side technologies. There is a need for a client side implementation, and there have been a few attempts at client side pagination due to the improved performance that the client side pagination can yield.
  • US Patent No. 7,647,553 B2 provides a hypertext markup language view template that allows a hypertext markup language content document to flow into a series of containers. This is performed by identifying the layout of the hypertext markup language document by using view templates.
  • a hypertext markup language authorship is provided that takes a bottomless continuous running hypertext markup language page and positions the content in a series of predefined containers within the display media. The content is flowed into the predefined containers.
  • This method does not handle the positioning of footnotes on the same page where respective footnote citations reside, which makes it difficult for a user to refer to citations.
  • This method also does not place floats proximate to their corresponding citations, which makes it difficult for the user to access floats corresponding to the citations. Furthermore, this method does not address header and footer conversion issues.
  • US Patent No. 6,789,229 B1 addresses issues with pagination that involves more processor intensive tasks.
  • This method uses pagination techniques that involve determining reproducible pages followed by numbering individual pages based on hard breaks.
  • This method requires a predetermined list of hard breaks occurring in the document being processed, which requires a lot of processing time to display page numbers. Therefore, there is a need for a faster and more efficient technique to process page numbers.
  • a publication by Hewlett-Packard Laboratories titled "Automatic Pagination of HTML Documents in a Web Browser" discloses automatic pagination of hypertext markup language (HTML) documents on the client side.
  • the methods disclosed in this publication utilize a built-in library of JavaScript® functions in a browser and size attributes to format an HTML page.
  • the paginations are performed through extensible stylesheet language transformation (XSLT). These pagination techniques render page numbers in tabs which occupy more space if the number of pages is large. These methods do not handle page numbers when a print operation is initiated. Moreover, these methods do not position floats and footnotes on the same page where their respective citations reside.
  • PDF portable document format
  • ePub ® electronic publication
  • Open eBook Forum DBA electronic publication
  • the portable document format is based on a fixed layout and does not support a fluid layout. Page numbers in the portable document format are forced and not based on the content.
  • the ePub file format is designed with reflowable content, which can optimize text and graphics according to a display device.
  • the ePub file format does not support header and footer at a conversion stage, places floats at random locations, and does not proxy floats, for example, videos and long tables to a linked source, thereby hindering the user experience.
  • a computer implemented method and a file format transformation system deployed on a client device that transforms marked-up content in a first file format, for example, a hypertext markup language (HTML) format to a reversible second file format that can be stored offline, executed with less latency and without an active network connection on any browser on any operating system, and can be restored to a continuous page.
  • a first file format for example, a hypertext markup language (HTML) format
  • HTML hypertext markup language
  • a computer implemented method and a file format transformation system that implements document tagging of all content including spaces and line breaks to transform fluid pages to fixed pages that are print-friendly and provide a fixed page view that captures document elements, for example, line breaks, floats, footnotes or end notes, page numbers, headers and footers, captions, etc., which are expressed relationally and assigned page appropriate placement.
  • a computer implemented method and a file format transformation system that position floats and footnotes on the same page where their respective citations reside, support headers and footers at a conversion stage, place floats at appropriate locations, and proxy floats, for example, videos and long tables to a linked source, thereby enhancing the user experience.
  • the method and the file format transformation system (FFTS) disclosed herein address the above stated need for transforming marked-up content in a first file format, for example, a hypertext markup language (HTML) format to a reversible second file format that can be stored offline, executed with less latency and without an active network connection on any browser on any operating system, and can be restored to a continuous page.
  • FFTS file format transformation system
  • the method and the FFTS disclosed herein implement document tagging of all content including spaces and line breaks to transform fluid pages to fixed pages that are print-friendly and provide a fixed page view that captures document elements, for example, line breaks, floats, footnotes or end notes, page numbers, headers and footers, captions, etc., which are expressed relationally and assigned page appropriate placement.
  • the client side implementation of the method and the file format transformation system (FFTS) disclosed herein allows a user of a document to be presented with an alternate presentation of the document without additional communication costs between a server and the user's client device.
  • the client side implementation of the method and the FFTS disclosed herein enables automated browser based pagination of markup language documents, for example, hypertext markup language (HTML) documents based on the dimensions of a web browser's window and the rendered size of components.
  • the reversible file format allows a user to view the page-broken document as a continuous document on a browser. The user can switch between the two views.
  • the computer implemented method and the FFTS disclosed herein position floats and footnotes on the same page where their respective citations reside, support headers and footers at a conversion stage, place floats at appropriate locations, and proxy floats, for example, videos and long tables to a linked source, thereby enhancing the user experience.
  • the computer implemented method disclosed herein is minimalistic in terms of document object model (DOM) manipulation and performs minimum manipulation to create pages.
  • DOM document object model
  • the FFTS is deployed on a client device comprising at least one processor configured to execute computer program instructions for transforming marked-up content in a first file format to a reversible second file format.
  • the FFTS receives the marked-up content of the first file format.
  • the FFTS reflows the received marked-up content of the first file format into a continuous page having a configurable page width.
  • the FFTS identifies spaces and block elements in the reflown marked-up content of the first file format.
  • the FFTS generates and appends tags to the identified spaces and the identified block elements in the reflown marked-up content of the first file format.
  • the FFTS determines line breaks in the reflown marked-up content of the first file format based on preconfigured criteria associated with the appended tags and tags the determined line breaks. For each of the determined line breaks, the FFTS identifies anchored floats, for example, figures, tables, images, videos, etc., in the reflown marked-up content of the first file format and tags the identified anchored floats. The FFTS positions the tagged anchored floats on a current page based on availability of space for the tagged anchored floats on the current page. The FFTS identifies footnotes in the reflown marked-up content of the first file format and tags the identified footnotes.
  • anchored floats for example, figures, tables, images, videos, etc.
  • the FFTS positions the tagged footnotes at a footnote section on the current page based on availability of space for the tagged footnotes on the current page.
  • the FFTS groups the marked-up content with the positioned anchored floats and the positioned footnotes on each page.
  • the FFTS inserts one or more of multiple pagination elements, for example, page numbers, a header, a footer, etc., on each page containing the grouped marked-up content.
  • the FFTS renders the grouped marked-up content with the inserted pagination elements in the reversible second file format.
  • related systems comprise circuitry and/or programming for effecting the methods disclosed herein; the circuitry and/or programming can be any combination thereof.
  • FIGS. 1A-1B illustrate a computer implemented method for transforming marked-up content in a first file format to a reversible second file format.
  • FIG. 2 exemplarily illustrates an interpretation of marked-up content in a reversible second file format.
  • FIGS. 3A-3F exemplarily illustrate a flowchart comprising the steps performed by a file format transformation system for transforming marked-up content in a first file format to a reversible second file format.
  • FIG. 4C exemplarily illustrates a screenshot showing a proof view of the marked-up content rendered in a reversible second file format.
  • FIG. 4D exemplarily illustrates a screenshot showing a source code of the marked-up content rendered in a reversible second file format.
  • FIG. 5 illustrates a system comprising a file format transformation system deployed on a client device for transforming marked-up content in a first file format to a reversible second file format.
  • FIG. 6 exemplarily illustrates the hardware architecture of a client device that deploys the file format transformation system for transforming marked-up content in a first file format to a reversible second file format.
  • FIGS. 7A-7Q exemplarily illustrate screenshots showing transformation of marked-up content in a first file format to a reversible second file format in edit and proof views.
  • FIGS. 1A-1B illustrate a computer implemented method for transforming marked-up content in a first file format to a reversible second file format, hereafter referred to as a
  • markup content refers to content having markups or appended tags that indicate the type of content, for example, a header, a footer, a caption, a table, a figure, an image, a video, a line break, etc.
  • line break refers to a pagination element representing the end of a line of text.
  • reversible file format refers to a file format that can be back transformed into the first file format.
  • the reversible file format disclosed herein is named, for example, as "PH5" that represents pagination with hypertext markup language 5 (HTML5) and comprises a set of properties including tags that are generated in accordance with structural semantics of documents in the first file format, for example, hypertext markup language (HTML) documents, and recognizes scripts that shape the PH5 output.
  • HTML5 hypertext markup language
  • the scripts that shape the PH5 output vary.
  • the computer implemented method disclosed herein employs a file format transformation system (FFTS) deployed on a client device comprising at least one processor configured to execute computer program instructions for transforming marked-up content in a first file format to a reversible second file format.
  • the client device is a computing device, for example, a personal computer, a tablet computing device, a mobile computer, a mobile phone, a smart phone, etc.
  • the FFTS converts web content seamlessly using document tagging.
  • the file format transformation system (FFTS) receives 101 marked-up content of a first file format, for example, a hypertext markup language (HTML), an extensible hypertext markup language format
  • the FFTS receives document contents, for example, in the HTML format.
  • the first file format is an extensible markup language (XML).
  • the FFTS converts a document from the XML format to an HTML format and then transforms the mark-up content in the HTML format to the reversible file format.
  • a browser that loads the marked-up content of the first file format inserts code points, for example, soft hyphens in the marked-up content of the first file format based on dictionary elements, for example, dictionary syllables such as - im-por-tant, con-se-quence, ap-pear-ance, etc.
  • a "soft hyphen" refers to a code point reserved in coded character sets used for breaking words across lines by inserting visible hyphens. Unicode defines the soft hyphens as invisible characters that allow a manual specification of a position where a hyphenated break is allowed without forcing a line break in an inconvenient place if the content or text is later reflowed.
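As an illustration of dictionary based soft hyphen insertion, the following minimal sketch places the Unicode soft hyphen (U+00AD) at the syllable boundaries of known words. The `SYLLABLES` table and the function names are hypothetical and exist only for this example; the sketch also assumes words are separated by single spaces and carry no attached punctuation.

```python
SOFT_HYPHEN = "\u00ad"  # Unicode U+00AD SOFT HYPHEN, invisible unless a line breaks there

# Hypothetical syllable dictionary built from the examples in the text.
SYLLABLES = {
    "important": "im-por-tant",
    "consequence": "con-se-quence",
    "appearance": "ap-pear-ance",
}


def hyphenate(word: str, pattern: str) -> str:
    """Insert soft hyphens into `word` at the syllable boundaries given by `pattern`."""
    pieces, pos = [], 0
    for syllable in pattern.split("-"):
        pieces.append(word[pos:pos + len(syllable)])  # keeps the word's own casing
        pos += len(syllable)
    return SOFT_HYPHEN.join(pieces)


def add_soft_hyphens(text: str) -> str:
    """Soft-hyphenate every space-separated word found in the dictionary."""
    out = []
    for word in text.split(" "):
        pattern = SYLLABLES.get(word.lower())
        out.append(hyphenate(word, pattern) if pattern else word)
    return " ".join(out)
```

Because the soft hyphen is a separate code point, removing every U+00AD from the result returns the original text, which is consistent with a transformation that is meant to be reversible.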
  • the FFTS reflows 102 the received marked-up content of the first file format into a continuous page having a configurable page width.
  • the term "reflow" refers to a browser process of recalculating positions of HTML elements in the HTML content and re-rendering the HTML elements with new positions.
  • the file format transformation system identifies 103 spaces and block elements in the reflown marked-up content of the first file format.
  • the FFTS identifies existing break elements, for example, hard breaks such as soft hyphen breaks, line breaks, and para breaks in the reflown marked-up content of the first file format.
  • the FFTS also identifies unanchored or uncited floats in the reflown marked-up content of the first file format.
  • the block elements are content elements that create blocks or large groupings of content and generally begin new lines of text. The block elements expand to fill a parent container containing text, inline elements, etc., and can have margins and/or padding, fitting the child elements.
  • the <div> element is a block element in the hypertext markup language (HTML).
  • the block elements, for example, <div>, <h1> to <h6>, and <p>, in a document start on a new line and take up the full width available.
  • word space refers to a single space between two words.
  • the FFTS tags the identified block elements, for example, as <div class="WSP">, where "WSP" refers to para break.
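A minimal sketch of tagging word spaces and a block element is shown below. The wrapper for the block follows the para break example in the text; the `data-ph5="ws"` value used for an ordinary word space is an assumption made for this illustration, since the text does not name the tag applied to spaces that are not line breaks. The function name is likewise hypothetical.

```python
import html

# Assumed tag for an ordinary word space; the attribute value "ws" is hypothetical.
WORD_SPACE_TAG = '<span data-ph5="ws"> </span>'


def tag_paragraph(text: str) -> str:
    """Wrap each single word space in its own span and the paragraph in a para-break div."""
    words = text.split(" ")
    body = WORD_SPACE_TAG.join(html.escape(word) for word in words)
    return '<div class="WSP">' + body + "</div>"
```

For example, `tag_paragraph("fluid page")` yields a block in which the single space is an addressable element, so a script can later measure its rendered width.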
  • floats and footnotes have prior representation in an input document of the first file format, for example, the HTML format and need no specific tagging.
  • floats refer, for example, to images, videos, audio content, tables, figures, etc., that float unhinged from the main content flow, except in their relationship to their citations as available in the input document.
  • footnotes refers to content that is intended to be placed at the bottom of a page and used to cite references to content on the page.
  • Image floats have, for example, <img> tags.
  • Table floats can be recognized by the presence of various tag elements, for example, <td>, <tr>, etc. Footnotes are in a number series and are shown as superscript <sup> numbers that are assigned to specific locations in the main content flow, and these superscripts reference notes appended to the main content, for example, at the bottom in a continuous page.
  • the file format transformation system determines 106 line breaks in the reflown marked-up content of the first file format based on preconfigured criteria associated with the appended tags
  • the FFTS identifies the line breaks through JavaScript® developed by Sun Microsystems, Inc.
  • the file format transformation system identifies 108 anchored floats in the reflown marked-up content of the first file format and tags the identified anchored floats.
  • the FFTS positions 109 the tagged anchored floats on a current page based on availability of space for the identified anchored floats on the current page.
  • the FFTS positions the tagged anchored floats proximal to associated float citations on the current page based on availability of space for the tagged anchored floats on the current page.
  • the FFTS identifies 110 footnotes in the reflown marked-up content of the first file format and tags the identified footnotes.
  • the FFTS positions 111 the tagged footnotes at a footnote section on the current page based on availability of space for the tagged footnotes on the current page.
  • the FFTS positions the tagged footnotes proximal to associated footnote citations on the current page based on availability of space for the tagged footnotes on the current page.
  • page break refers to a marker that indicates that content which follows the marker is part of a new page.
  • the FFTS groups 113 the marked-up content with the positioned anchored floats and the positioned footnotes on each page.
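The grouping of lines and footnotes onto pages can be sketched as a greedy fill. This is a simplification under stated assumptions, not the disclosed procedure: every line and footnote has a known pixel height, and a line moves to the next page together with its footnotes whenever the current page cannot hold both. The `Line`, `Page`, and `paginate` names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Line:
    text: str
    height: int
    # Heights of the footnotes cited on this line (empty when it cites none).
    footnote_heights: List[int] = field(default_factory=list)


@dataclass
class Page:
    lines: List[str] = field(default_factory=list)
    footnotes: int = 0  # number of footnotes placed in this page's footnote section
    used: int = 0       # pixels consumed by lines plus footnotes


def paginate(lines: List[Line], page_height: int) -> List[Page]:
    """Fill pages so that each footnote lands on the page of its citation."""
    pages = [Page()]
    for line in lines:
        need = line.height + sum(line.footnote_heights)
        page = pages[-1]
        # Start a new page when the line and its footnotes do not both fit.
        if page.used + need > page_height and page.lines:
            page = Page()
            pages.append(page)
        page.lines.append(line.text)
        page.footnotes += len(line.footnote_heights)
        page.used += need
    return pages
```

Reserving the footnote height at the moment the citing line is placed is the design choice that keeps citation and footnote together in this sketch.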
  • the FFTS tags the line breaks, for example, as <span data-ph5="wsbr">.
  • paragraph break refers to a pagination element representing the end of a paragraph.
  • the paragraph break is a non-intrusive data model that preserves an original data model of the hypertext markup language (HTML).
  • the FFTS tags the paragraph breaks, for example, as <div data-ph5="wsp">.
  • the file format transformation system positions the floats, for example, figures, tables, text boxes, etc., closer to anchors within the available space. Where anchors are not available, the FFTS appends anchors at the input location of the float.
  • the FFTS initially positions the floats near their anchors and then moves the floats to the bottom or top of the current page, or to one of the following pages according to the availability of space similar to footnotes.
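The float placement described above can be sketched as a search that starts on the page holding the anchor and moves to following pages until enough space is found. This is an illustrative simplification: it tracks only the free height per page, does not distinguish top from bottom placement, and returns `None` for a float taller than a page so that the caller can, for example, proxy it to a linked source. The function name and parameters are hypothetical.

```python
from typing import List, Optional


def place_float(anchor_page: int, float_height: int,
                free: List[int], page_height: int) -> Optional[int]:
    """Return the index of the page that receives the float, updating `free` in place.

    `free[i]` is the remaining height on page i. Returns None when the float
    cannot fit on any page.
    """
    if float_height > page_height:
        return None
    page = anchor_page
    while True:
        while page >= len(free):
            free.append(page_height)  # open fresh pages as needed
        if free[page] >= float_height:
            free[page] -= float_height
            return page
        page += 1  # no room here, try the following page
```

The loop always terminates because a freshly opened page has the full `page_height` available and the float is known to be no taller than that.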
  • the FFTS handles the grouped elements comprising, for example, a float and a caption associated with the float in the reversible file format at a position assigned in the marked-up content of the first file format to the float.
  • the file format transformation system declares uniform resource locator (URL) breaks to a paging engine.
  • the FFTS couples expressions such as footnotes to page breaks.
  • the page break breaks a web page into a predefined length and delivers cut pages, while ensuring headings and words at the beginning and end paragraphs are not widowed or orphaned.
  • the FFTS introduces page breaks when a script cookie cuts the fluid page to a reference dimension.
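One way to cut a fluid run of paragraphs to a reference dimension while avoiding widowed and orphaned lines is sketched below. It is an assumption-laden illustration rather than the disclosed script: paragraphs are reduced to line counts, the page is measured in lines, and `min_lines` (hypothetical) is the smallest fragment of a split paragraph allowed at the bottom or top of a page. Space left unused at the bottom of a page corresponds to where fillers would go.

```python
from typing import List, Tuple


def cut_pages(paragraph_lines: List[int], lines_per_page: int,
              min_lines: int = 2) -> List[List[Tuple[int, int]]]:
    """Split paragraphs into pages of (paragraph_index, line_count) chunks."""
    if lines_per_page < 2 * min_lines:
        raise ValueError("page too short for the requested widow/orphan control")
    pages: List[List[Tuple[int, int]]] = [[]]
    remaining = lines_per_page
    for index, n in enumerate(paragraph_lines):
        while n > 0:
            if n <= remaining:            # the rest of the paragraph fits
                pages[-1].append((index, n))
                remaining -= n
                break
            take = remaining
            rest = n - take
            if rest < min_lines:          # would leave a widow on the next page
                take -= min_lines - rest
            if take < min_lines:          # would leave an orphan on this page
                take = 0
            if take:
                pages[-1].append((index, take))
                n -= take
            pages.append([])              # cut the page here
            remaining = lines_per_page
    return pages
```

For instance, a 7-line paragraph on 6-line pages is split 5 and 2 rather than 6 and 1, so no single line is stranded at the top of the second page.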
  • the FFTS initially positions footnotes next to the corresponding citations.
  • the FFTS moves the footnotes to the footer section of the page after introduction of the page breaks.
  • the FFTS tags the footnotes, for example, as <div data-ph5="footnote">, where the first footnote comprises an additional class called "firstFootnote" and the rest of the footnotes comprise an additional class called "notFirstFootnote".
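The footnote tagging convention can be illustrated with a short sketch that emits one element per footnote on a page. The attribute and class names follow the example in the text; the function name and the exact attribute order are assumptions of this illustration.

```python
import html
from typing import List


def tag_footnotes(notes: List[str]) -> List[str]:
    """Wrap each footnote of a page in a tagged div, marking the first one specially."""
    tagged = []
    for i, note in enumerate(notes):
        cls = "firstFootnote" if i == 0 else "notFirstFootnote"
        tagged.append(
            f'<div data-ph5="footnote" class="{cls}">{html.escape(note)}</div>'
        )
    return tagged
```

Distinguishing the first footnote from the rest gives a style sheet a hook for, say, extra spacing between the running text and the footnote section.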
  • the FFTS numbers the footnotes and positions the footnotes at the bottom of the relevant page.
  • the file format transformation system inserts page numbers, a header, a footer, a footnote ruler, fillers, etc., or any combination thereof in one or more pages in the reversible second file format.
  • the FFTS inserts page numbers on the pages based on a predefined numbering style.
  • the FFTS inserts the footnote ruler, for example, as a horizontal line to separate running text and the footnotes.
  • the FFTS tags the footnote ruler, for example, as <div data-ph5="footNoteRuler">.
  • the FFTS allows the footnote ruler to be tweaked on and off in the cut pages.
  • the FFTS uses filler compensation for eliminating orphans, widows, and divorce between couples, for example, section heading and paragraph, figure and table, table heading and table, etc.
  • the FFTS automatically deploys fillers, for example, line spaces, if needed, to fill a page and improve aesthetics.
  • the file format transformation system (FFTS) renders 115 the grouped marked-up content with the inserted pagination elements in the reversible file format.
  • the FFTS compiles and positions the reflown marked-up content and the pagination elements with associated properties at predetermined context based positions across multiple pages based on page dimensions and the appended tags.
  • the FFTS performs hyphenation and justification in the rendered marked-up content in the reversible second file format to provide kerning based on aesthetics, for example, for avoidance of loose lines and blank rivers.
  • the reversible second file format allows the marked-up content to be reversed to the first file format to restore the continuous page.
  • the rendered marked-up content in the reversible second file format is accessible on one or more of multiple browsers on multiple operating systems.
  • the fixed page in the reversible file format to which the marked-up content in the first file format is transformed is expressed, for example, as a pixel dimension equivalent of a paper size or a device size.
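The pixel dimension equivalent of a paper size can be worked out as in the sketch below. It assumes the CSS convention of 96 pixels per inch; the function names are hypothetical, and a different resolution can be passed if the target device calls for it.

```python
MM_PER_INCH = 25.4


def page_pixels(width_in: float, height_in: float, dpi: int = 96) -> tuple:
    """Convert a paper size in inches to whole pixels at the given resolution."""
    return round(width_in * dpi), round(height_in * dpi)


def page_pixels_mm(width_mm: float, height_mm: float, dpi: int = 96) -> tuple:
    """Same conversion for paper sizes given in millimetres, such as A4."""
    return page_pixels(width_mm / MM_PER_INCH, height_mm / MM_PER_INCH, dpi)
```

Under this assumption, an 8.5" x 11" page corresponds to 816 x 1056 pixels.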
  • the data model of the reversible file format for example, referred to as the PH5 format transforms a fluid page, for example, in a hypertext markup language (HTML) format to a fixed page, for example, in the reversible file format or the PH5 format, where the transformation is reversible. That is, the FFTS interprets a fluid page and delivers a fixed page.
  • the tagged input allows the transformation of a fluid page to a fixed page.
  • the enriched inheritance comprises page breaks.
  • the other elements are defined in terms of the page breaks.
  • the extension of the fixed page in the PH5 format is, for example, .PH5.
  • the FFTS bridges fluid web-content and fixed-page typesetting, originating as a fluid HTML, without a reference printer at the destination.
  • the PH5 format is similar, for example, to a zip file format such as an electronic publication (ePub) format and can be opened in a common browser on any operating system in a fixed page view.
  • a PH5 file can be back-transformed into a standard hypertext markup language (HTML) file from which the PH5 file was generated with the fluidity of the HTML file restored.
  • the file format transformation system performs document intelligence tagging. Tagging the spaces or blanks effects visible content for emulation and standardization.
  • line break candidates are identified and marked up as page breaks. With this method, implicit statements in the document are understood and tagged for downstream machine reading or paging.
  • the transformation from a fluid file format to the reversible file format, that is, the PH5 file format is accomplished subject to the availability of a tag set that exposes an understanding of document semantics to scripts that generate the PH5 package.
  • the tag set allows creation of a fixed page view that captures document elements that are expressed relationally and that are then assigned page-and-context-appropriate placement and styling.
  • a PH5 file, as a portable document, anticipates the tag set in a work queue and defines a standard for creating the same.
  • the PH5 files do not need reference printers, driver installations, configuration of printer settings, etc., and also do not need a reader application or a browser plug-in. Furthermore, the PH5 files allow offline storage of information.
  • FIG. 2 exemplarily illustrates an interpretation of marked-up content in a reversible second file format, herein referred to as a PH5 format.
  • a typical HTML page does not have tags specified for spaces.
  • the HTML page comprises a header, a footer, footnotes, floats such as figures, tables, images, video, audio, etc.
  • the FFTS performs tagging without replacing the original HTML tags. Preserving the original HTML tags allows the final output in the reversible file format to be reverted to the HTML page if a user wants to suppress the changes.
  • the tagged HTML page that is, the PH5 page 201 exemplarily illustrated in FIG. 2, contains all the content of the original input HTML page along with the PH5 format tags.
  • the tagging process allows the FFTS to transform a fluid HTML page into fixed HTML pages.
  • a fluid HTML page contains responsive content elements that resize their position and geometry according to a web browser width.
  • the FFTS positions any available footnotes proximate to a respective citation and, once the page breaks are introduced, the FFTS tags the footnotes with <div data-ph5="footnote"> and positions the footnotes at the bottom of the page.
  • the FFTS places an additional class tag after the first footnote "firstFootnote” and tags the following footnotes with an additional tag
  • the FFTS renders the PH5 page 201 with the PH5 tags disclosed above.
  • the file format transformation system (FFTS) performs PH5 tag recognition for automated browser based pagination and generates output pages 202.
  • the FFTS recognizes the PH5 format tags appended in the PH5 tagged hypertext markup language (HTML) page 201.
  • the PH5 tagged HTML page 201 comprises two figures labeled as "FIG 1" and "FIG 2" along with another float.
  • the FFTS encounters the float tag of the first float, that is, FIG 1, in the PH5 tagged HTML page 201 and positions "FIG 1" proximal to the corresponding citation until the FFTS encounters a page break tag.
  • the FFTS positions the float "FIG 1" at the top or bottom of the page close to the respective citation.
  • the FFTS then allows the reflow of the HTML content to fit in the specified page width.
  • the FFTS, upon recognizing the footnote tag, introduces the footnote matter at the bottom of the page in close proximity to the respective citation.
  • the FFTS introduces a footnote ruler to separate the main content from the footnote matter upon recognition of the footnote tag.
  • the FFTS further encounters a page number tag and introduces a page number at the bottom of the page after the footnote matter.
  • the FFTS then breaks the page into an individual page after encountering the page break tag which is placed based on the reference page height.
  • the file format transformation system then proceeds to the next section after the page break tag, proxies "FIG 2" and other floats, for example, audio, video, tables, etc., to a linked source, positions these floats according to the availability of space, positions page breaks according to the pixel dimension of the page, and inserts a page number for the current page.
  • the FFTS then proceeds to the next section after the page break tag, positions the remaining footnotes on the next page, and inserts a page number for the next page.
  • the FFTS performs the page transformation process until the last page break tag is recognized.
  • FIGS. 3A-3F exemplarily illustrate a flowchart comprising the steps performed by the file format transformation system (FFTS) for transforming marked-up content in a first file format, for example, a hypertext markup language (HTML) format to a reversible second file format, hereafter referred to as a "reversible file format".
  • the FFTS loads 301 the HTML content with cascading style sheets (CSS) in a browser and examines 302 the loaded HTML content.
  • the FFTS analyzes and describes syntactic roles of the HTML content.
  • the FFTS introduces 303 hidden code points, for example, soft hyphens into the HTML content based on popular dictionary elements, for example, dictionary syllables.
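  • as a minimal sketch of introducing soft hyphens from dictionary syllables, the following operates on plain text rather than on a live HTML document. The tiny SYLLABLES table and the function name insertSoftHyphens are hypothetical stand-ins for whatever hyphenation dictionary an implementation would actually use.

```javascript
// Assumption: a small hand-written syllable table stands in for the
// "popular dictionary elements" mentioned in the text.
var SYLLABLES = {
  pagination: ['pag', 'i', 'na', 'tion'],
  transformation: ['trans', 'for', 'ma', 'tion'],
  reversible: ['re', 'vers', 'i', 'ble']
};

var SOFT_HYPHEN = '\u00AD'; // invisible unless a line breaks at this point

function insertSoftHyphens(text) {
  // Visit each run of letters; words missing from the table are kept as-is.
  return text.replace(/[A-Za-z]+/g, function (word) {
    var parts = SYLLABLES[word.toLowerCase()];
    if (!parts) {
      return word;
    }
    // Re-slice the original word so that its capitalisation is preserved.
    var pieces = [];
    var offset = 0;
    parts.forEach(function (part) {
      pieces.push(word.slice(offset, offset + part.length));
      offset += part.length;
    });
    return pieces.join(SOFT_HYPHEN);
  });
}
```

  • because the soft hyphen (U+00AD) renders only when a browser actually breaks the line at that point, the inserted code points remain hidden in the continuous view.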
  • the FFTS then reflows 304 the HTML content to fit a desired page width with a running continuous page height.
  • the reflow process is used in a markup language document to render the markup language document to different types of user devices.
  • the FFTS performs word spacing according to a kerning of a selected font.
  • the FFTS also identifies 307 block elements in the reflown HTML content and introduces 308 a tag, for example, a <div class="WSP"> tag, for each of the identified block elements in the reflown HTML content, where "WSP" refers to word space paragraph.
  • <div> refers to a markup language tag that defines a container for holding content elements.
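  • the tagging of spaces and block elements can be sketched, under stated assumptions, with plain strings instead of a browser document. The class names "ws" and "WSP" follow the tags named above; the function names, the use of a span element for word spaces, and the choice to wrap each paragraph in its own container are assumptions of this sketch.

```javascript
// Hypothetical sketch; a real implementation would walk DOM text nodes.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

function tagWordSpaces(text) {
  // Wrap every run of whitespace between words in its own span so that
  // each word space can later be measured individually.
  return escapeHtml(text.trim()).replace(/\s+/g, '<span class="ws"> </span>');
}

function tagParagraphs(paragraphs) {
  // Wrap each block element (here: a paragraph of text) in a WSP container.
  return paragraphs.map(function (paragraph) {
    return '<div class="WSP">' + tagWordSpaces(paragraph) + '</div>';
  }).join('\n');
}

console.log(tagParagraphs(['a fluid page']));
// <div class="WSP">a<span class="ws"> </span>fluid<span class="ws"> </span>page</div>
```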
  • the file format transformation system iteratively processes the generated tags and identifies, for each of the identified spaces and the identified block elements, one or more pagination elements in the reflown HTML content.
  • the FFTS identifies pagination elements such as line breaks, floats, and footnotes as exemplarily illustrated in FIGS. 3C-3E respectively.
  • FIG. 3B exemplarily illustrates iteration steps performed by the file format
  • the file format transformation system (FFTS) iterates 309 the steps of determining and assigning line breaks for every occurrence of the <WS> tag and the <WSP> tag, as exemplarily illustrated in FIG. 3C, until all the <WS> tags and the <WSP> tags are processed 310.
  • the file format transformation system (FFTS), after processing all the <WS> and <WSP> tags, iterates 311 all the line breaks and then proceeds to the steps exemplarily illustrated in FIGS. 3D-3E.
  • FIG. 3C exemplarily illustrates determination and assignment of line breaks at appropriate positions in the reversible file format.
  • the FFTS determines and assigns line breaks upon encountering any one of the following conditions: if the word space <WS> equals zero, then the FFTS identifies 312 the word space as a line break; if a soft hyphen <SHY> is not equal to zero, then the FFTS identifies 313 the soft hyphen as a line break; and if the FFTS identifies a paragraph break, the FFTS forces 314 a line break. After assigning the line breaks, the FFTS iterates 311 all the line breaks as exemplarily illustrated in FIG. 3B.
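  • the three line break conditions above can be illustrated with the following sketch. It reads "equals zero" and "not equal to zero" as statements about the rendered width of the tagged element, which is an interpretation rather than something the text states outright; the token shape and the function name findLineBreaks are likewise hypothetical.

```javascript
// Assumption: each token carries a width already measured in the browser
// (for example with getBoundingClientRect); here widths are plain numbers.
function findLineBreaks(tokens) {
  var breaks = [];
  tokens.forEach(function (token, index) {
    if (token.type === 'ws' && token.width === 0) {
      // A word space collapsed to zero width: the line wrapped here.
      breaks.push({ index: index, reason: 'word-space' });
    } else if (token.type === 'shy' && token.width !== 0) {
      // A soft hyphen that became visible: the word was split here.
      breaks.push({ index: index, reason: 'soft-hyphen' });
    } else if (token.type === 'paragraph-break') {
      // A paragraph break always forces a line break.
      breaks.push({ index: index, reason: 'paragraph' });
    }
  });
  return breaks;
}
```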
  • FIG. 3D exemplarily illustrates positioning of floats proximate to a first citation in the reflown HTML content.
  • the FFTS examines each line from top to bottom until the FFTS reaches the specified page height of 500 pixels, while keeping track of the pixels covered. If the FFTS encounters a float before the 500 pixel height, the FFTS analyzes the float pixel dimension and the pixels covered so far and determines the sum of the float pixel dimension and the pixels covered till the point where the float was cited.
  • if the sum exceeds the specified page height, the FFTS inserts the float on the next available page after the page break, in a way that the float follows the citation but does not precede the citation, and if the sum of the float pixel dimension and the pixels covered till the point where the float was cited is less than the specified page height, then the float is inserted on the same page proximate to its citation.
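  • the float placement decision reduces to a comparison of a sum against the page height, which may be sketched as follows. The function name placeFloat and the return values are hypothetical, and treating a sum exactly equal to the page height as fitting on the current page is an assumption, since the text addresses only the "exceeds" and "less than" cases.

```javascript
// Hypothetical sketch of the float placement rule.
function placeFloat(pixelsCovered, floatHeight, pageHeight) {
  // Pixels covered up to the citation plus the height of the float itself.
  var total = pixelsCovered + floatHeight;
  // Assumption: an exact fit is placed on the current page.
  return total > pageHeight ? 'next-page' : 'current-page';
}

console.log(placeFloat(300, 150, 500)); // current-page (450 fits within 500)
console.log(placeFloat(300, 250, 500)); // next-page (550 exceeds 500)
```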
  • the FFTS proceeds to the steps exemplarily illustrated in FIG. 3F.
  • FIG. 3E exemplarily illustrates positioning of footnotes at relevant pages in the reversible file format.
  • the file format transformation system identifies footnotes in the reflown hypertext markup language (HTML) content and checks 319 whether space is available for a footnote cited in the reflown HTML content. If space is not available in the current page, the FFTS positions 320 the footnote, that is, the citation point's sentence and matter, on the next page. If there is enough space available in the current page, the FFTS positions 321 the footnote matter on a page footnote section as the footnote is cited. For example, for a page with 500 pixels of fixed height, the FFTS examines each line from top to bottom until the FFTS reaches the specified page height of 500 pixels, keeping track of pixels covered.
  • the FFTS analyzes the corresponding footnote pixel dimensions and the pixels covered so far and determines the sum of the footnote pixel dimension and the pixels covered till the point where the footnote was cited. If the sum exceeds the specified page height, for example, 500 pixels, the FFTS accommodates the footnote along with its citation on the next available page after the page break, and if the sum of the footnote pixel dimension and the pixels covered till the point where the footnote was cited is less than the specified page height, then the footnote is accommodated proximate to its citation in the same page at the bottom. After positioning the footnotes at relevant pages in the reversible file format, the FFTS proceeds to the steps exemplarily illustrated in FIG. 3F.
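  • a possible way to walk the lines of a page while keeping each footnote together with its citing line is sketched below. The input shape (lines with an id, a height and an optional footnote) and the function name layoutWithFootnotes are assumptions for illustration; footnote heights are counted against the page because the footnote section occupies space at the bottom of the same page.

```javascript
// Hypothetical sketch: lines is an array of
//   { id, height, footnote: { id, height } (optional) }
function layoutWithFootnotes(lines, pageHeight) {
  var pages = [];
  var page = { lines: [], footnotes: [], used: 0 };
  lines.forEach(function (line) {
    var footnoteHeight = line.footnote ? line.footnote.height : 0;
    var needed = line.height + footnoteHeight;
    // Start a new page when the line plus its footnote would overflow.
    // An empty page always accepts the line, so an oversized item cannot
    // cause an endless run of empty pages.
    if (page.used + needed > pageHeight && page.lines.length > 0) {
      pages.push(page);
      page = { lines: [], footnotes: [], used: 0 };
    }
    page.lines.push(line.id);
    if (line.footnote) {
      page.footnotes.push(line.footnote.id);
    }
    page.used += needed;
  });
  if (page.lines.length > 0) {
    pages.push(page);
  }
  return pages;
}
```

  • with a page height of 500 pixels, two 200 pixel lines followed by a 50 pixel line that cites a 100 pixel footnote yield two pages: the citing line and its footnote move together to the second page because 400 + 150 exceeds 500.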
  • FIG. 3F exemplarily illustrates the rendering of the hypertext markup language (HTML) content in the reversible file format.
  • the file format transformation system (FFTS) compares 322 the HTML content with a specified page height and introduces page breaks appropriately into the HTML content.
  • the page breaks break the HTML content into individual pages of a predefined length.
  • the FFTS groups 323 the HTML content on each page from header to footer using a ⁇ div> element.
  • the FFTS inserts 324 page numbers into the individual pages based on a predefined numbering style, a header and footer, and places a footnote ruler wherever necessary.
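  • the grouping of each page from header to footer and the insertion of page numbers may be sketched as below. The function name groupIntoPageBlocks, the "page" and "page-number" class names, the data-page attribute, and the arabic numbering starting at 1 are assumptions of this sketch; only the use of a <div> element for grouping is taken from the text.

```javascript
// Hypothetical sketch: pagesOfLines is an array of pages, each page being
// an array of HTML strings that were already assigned to that page.
function groupIntoPageBlocks(pagesOfLines, options) {
  var opts = options || {};
  var header = opts.header || '';
  var footer = opts.footer || '';
  return pagesOfLines.map(function (lines, i) {
    var pageNumber = i + 1; // assumption: arabic numbering starting at 1
    return [
      '<div class="page" data-page="' + pageNumber + '">',
      '  <div class="page-header-footer">' + header + '</div>',
      lines.map(function (line) { return '  ' + line; }).join('\n'),
      '  <div class="page-header-footer">' + footer +
        '<span class="page-number">' + pageNumber + '</span></div>',
      '</div>'
    ].join('\n');
  });
}
```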
  • the FFTS checks 325 whether all the line breaks are processed.
  • the FFTS iterates 311 all the line breaks as exemplarily illustrated in FIGS. 3B-3C.
  • the FFTS then delivers 326 the marked-up content in the reversible file format.
  • the FFTS provides an option to revert 327 the changes made in the reversible file format to the first file format, for example, the HTML file format. If a user wants to revert from the reversible file format to the HTML file format, the FFTS suppresses 328 the changes by hiding the changes in a background and displays the input HTML page having the input HTML content.
  • FIGS. 4A-4B exemplarily illustrate screenshots showing edit views of marked-up content.
  • FIG. 4A exemplarily illustrates a screenshot of an input hypertext markup language (HTML) page containing marked-up content without an edit window 402 in a right pane of a graphical user interface (GUI) 401.
  • FIG. 4B exemplarily illustrates a screenshot of the input HTML page containing marked-up content showing the edit window 402 in the right pane of the GUI 401.
  • the source code of the input HTML page in the edit view is provided below:
  • Four identified neurotrophic factors including nerve growth factor, brain-derived neurotrophic factor (BDNF), neurotrophin-3, and neurotrophin-4 exert their effects through binding to two different receptors, the tropomyosin-related kinase (Trk) receptor and the p75 neurotrophin receptor. All
  • Aggregated A<span class="unicode-char">P</span> peptide including both oligomeric and fibrillar species induces neuronal cell death <span
  • FIG. 4C exemplarily illustrates a screenshot of the input hypertext markup language (HTML) page containing marked-up content in a proof view.
  • the file format transformation system transforms the input hypertext markup language (HTML) page exemplarily illustrated in FIGS. 4A-4B, to an output page in the reversible file format, that is, the PH5 format as exemplarily illustrated in FIG. 4C.
  • the FFTS further hyphenates words where appropriate.
  • FIG. 4D exemplarily illustrates a screenshot showing a source code of the marked-up content rendered in the reversible file format, that is, the PH5 format.
  • FIG. 5 exemplarily illustrates a system 500 comprising a file format transformation system (FFTS) 502 deployed on a client device 501 for transforming marked-up content in a first file format to a reversible second file format.
  • the client device 501 can be, for example, a personal computer, a tablet computing device, a mobile computer, a mobile phone, a smart phone, a portable computing device, a laptop, a personal digital assistant, a touch centric device, a workstation, a portable electronic device, a network enabled computing device, an interactive network enabled communication device, any other suitable computing equipment, combinations of multiple pieces of computing equipment, etc.
  • the FFTS 502 is
  • the system 500 disclosed herein comprises a non-transitory computer readable storage medium such as a memory unit, and at least one processor communicatively coupled to the non-transitory computer readable storage medium on the client device 501.
  • non-transitory computer readable storage medium refers to all computer readable media, for example, non-volatile media such as optical discs or magnetic disks, volatile media such as a register memory, a processor cache, etc., and transmission media such as wires that constitute a system bus coupled to the processor, except for a transitory, propagating signal.
  • the non-transitory computer readable storage medium stores computer program instructions defined by modules, for example, 502a, 502b, 502c, 502d, 502e, 502f, etc., of the FFTS 502.
  • the processor is configured to execute the defined computer program instructions.
  • the file format transformation system (FFTS) 502 further comprises a content reception module 502a, a content reflow module 502b, a space and block identification module 502c, a tagging module 502d, a pagination element processing module 502e, and a compiler 502f.
  • the content reception module 502a receives the marked-up content of the first file format, for example, the hypertext markup language (HTML) format.
  • An example of a pseudocode of the content reception module 502a executed to receive the marked-up content of the first file format is provided below:
  • function receiveContent(self, container, source) {
  • var innerContainer = null;
  • paginator = null;
  • var content = null;
  • paginator = domHelper.create('div');
  • domHelper.append(paginator, container); content = source;
  • innerContainer.innerHTML = source;
  • the content reflow module 502b reflows the received marked-up content of the first file format into a continuous page having a configurable page width.
  • width = self.options.page.width;
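  • in a browser, reflowing to a configurable page width is presumably delegated to the layout engine by constraining the width of a container. Purely to illustrate the effect outside a browser, the following sketch fills lines greedily up to a given width. The function names reflow and fixedWidthMeasure, and the fixed width of 10 pixels per character, are assumptions and do not reflect real font metrics or kerning.

```javascript
// Hypothetical sketch: greedy line filling against a configurable width.
function reflow(words, pageWidth, measure) {
  var lines = [];
  var current = [];
  var currentWidth = 0;
  var spaceWidth = measure(' ');
  words.forEach(function (word) {
    var wordWidth = measure(word);
    // Width this word would add to the current line, including a space
    // when the line already holds at least one word.
    var extra = current.length > 0 ? spaceWidth + wordWidth : wordWidth;
    if (current.length > 0 && currentWidth + extra > pageWidth) {
      lines.push(current.join(' '));
      current = [word];
      currentWidth = wordWidth;
    } else {
      current.push(word);
      currentWidth += extra;
    }
  });
  if (current.length > 0) {
    lines.push(current.join(' '));
  }
  return lines;
}

// Assumption: every character is 10 pixels wide.
function fixedWidthMeasure(text) {
  return text.length * 10;
}

console.log(reflow('the quick brown fox jumps'.split(' '), 100, fixedWidthMeasure));
// [ 'the quick', 'brown fox', 'jumps' ]
```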
  • the space and block identification module 502c identifies spaces and block elements in the reflown marked-up content of the first file format.
  • An example of a pseudocode of the space and block identification module 502c executed to identify and tag spaces and block elements in the reflown marked-up hypertext markup language (HTML) content is provided below:
  • function putSpanForWordSpace(self, content) {
  • the tagging module 502d generates and appends tags to the identified spaces and the identified block elements in the reflown marked-up content of the first file format. For each of the identified spaces and the identified block elements, the pagination element processing module 502e determines line breaks in the reflown marked-up content of the first file format based on preconfigured criteria associated with the appended tags. The tagging module 502d tags the determined line breaks.
  • An example of the pseudocode of the pagination element processing module 502e executed to determine the line breaks is provided below:
  • function determineLineBreaks() {
  • for each of the determined line breaks, the pagination element processing module 502e identifies anchored floats in the reflown marked-up content of the first file format. The tagging module 502d tags the identified anchored floats. Further, for each of the determined line breaks, the pagination element processing module 502e positions the tagged anchored floats on a current page based on availability of space for the tagged anchored floats on the current page. The pagination element processing module 502e positions the tagged anchored floats proximal to associated float citations on the current page based on the availability of space for the tagged anchored floats on the current page.
  • the pagination element processing module 502e identifies footnotes in the reflown marked-up content of the first file format.
  • the tagging module 502d tags the identified footnotes.
  • the pagination element processing module 502e positions the tagged footnotes at a footnote section on the current page based on availability of space for the tagged footnotes on the current page.
  • the pagination element processing module 502e positions the tagged footnotes proximal to associated footnote citations on the current page based on the availability of space for the tagged footnotes on the current page.
  • footnoteHeight = getFootnoteHeight(footnoteItem);
  • the pagination element processing module 502e positions page breaks in the continuous page based on a configurable page height and the determined line breaks for the positioning of the tagged anchored floats and the tagged footnotes on a subsequent page on nonavailability of space on the current page.
  • the compiler 502f groups the marked-up content with the positioned anchored floats and the positioned footnotes on each page.
  • the pagination element processing module 502e inserts one or more pagination elements, for example, page numbers, a header, a footer, a footnote ruler, fillers, etc., on each page containing the grouped marked-up content.
  • the compiler 502f renders the grouped marked-up content with the inserted pagination elements in the reversible second file format.
  • An example of the pseudocode of the compiler 502f executed for performing the steps of grouping and insertion of page numbers is provided below:
  • function makePageBlocks() {
  • the pagination element processing module 502e handles grouped elements comprising, for example, a float and a caption associated with the float, in the reversible second file format at a position assigned in the marked-up content of the first file format to the float. If a user wants to revert back to the input marked-up content page, the compiler 502f reverses the marked-up content in the reversible second file format to the first file format to restore the continuous page.
  • An example of the pseudocode of the compiler 502f executed for reversing the PH5 mark-up to the original input (HTML) mark-up is provided below:
  • function removePaginationArtifacts() {
  • var headerFooter = content.find(".page-header-footer");
  • var paginationElements = content.find(".ws,.shy,.wsp");
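  • the reversal step can be illustrated, under stated assumptions, with a string-based sketch that strips the pagination markup selected above. The function name stripPaginationMarkup is hypothetical, the regular expressions assume that headers and footers contain no nested <div> elements, and a production implementation would more likely operate on the document object model as the pseudocode suggests.

```javascript
// Hypothetical sketch: remove pagination artifacts from an HTML string.
function stripPaginationMarkup(html) {
  return html
    // Drop running headers and footers (assumed not to contain nested divs).
    .replace(/<div class="page-header-footer">[\s\S]*?<\/div>/g, '')
    // Unwrap tagged word spaces back to plain spaces.
    .replace(/<span class="ws">\s*<\/span>/g, ' ')
    // Remove soft hyphens, whether wrapped in a tag or bare.
    .replace(/<span class="shy">\u00AD?<\/span>/g, '')
    .replace(/\u00AD/g, '');
}
```

  • because the tagging step adds markup without replacing the original tags, removing the added elements in this way leaves the original fluid content in place.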
  • FIG. 6 exemplarily illustrates the hardware architecture 600 of a client device 501 that deploys the file format transformation system (FFTS) 502 exemplarily illustrated in FIG. 5, for transforming marked-up content in a first file format to a reversible second file format.
  • the FFTS 502 is deployed on a computer system of the client device 501 and is programmable using a high level computer programming language.
  • the FFTS 502 may be implemented using programmed and purposeful hardware.
  • the hardware architecture 600 of the client device 501 comprises a processor 601, a non-transitory computer readable storage medium such as a memory unit 602 for storing computer programs and data, an input/output (I/O) controller 603, a network interface 604, a data bus 605, a display unit 606, input devices 607, a fixed media drive 608 such as a hard drive, a removable media drive 609 for receiving removable media, output devices 610, etc.
  • the processor 601 refers to any one or more microprocessors, central processing unit (CPU) devices, finite state machines, computers, microcontrollers, digital signal processors, logic, a logic device, an electronic circuit, an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA), a chip, etc., or any combination thereof, capable of executing computer programs or a series of commands, instructions, or state transitions.
  • the processor 601 may also be implemented as a processor set comprising, for example, a programmed microprocessor and a math or graphics co-processor.
  • the processor 601 is selected, for example, from the Intel ® processors such as the Itanium ® microprocessor or the Pentium ® processors, Advanced Micro Devices (AMD ® ) processors such as the Athlon ® processor, UltraSPARC ® processors, microSPARC ® processors, hp ® processors, International Business Machines (IBM ® ) processors such as the PowerPC ® microprocessor, the MIPS ® reduced instruction set computer (RISC) processor of MIPS Technologies, Inc., RISC based computer processors of ARM Holdings, Motorola ® processors, Qualcomm ® processors, etc.
  • the FFTS 502 disclosed herein is not limited to employing a processor 601.
  • the FFTS 502 may also employ a controller or a microcontroller.
  • the processor 601 executes the modules, for example, 502a, 502b, 502c, 502d, 502e, 502f, etc., of the FFTS 502 exemplarily illustrated in FIG. 5.
  • the memory unit 602 is used for storing computer programs, applications, and data.
  • the content reception module 502a, the content reflow module 502b, the space and block identification module 502c, the tagging module 502d, the pagination element processing module 502e, the compiler 502f, etc., exemplarily illustrated in FIG. 5, are stored in the memory unit 602 of the client device 501.
  • the memory unit 602 is, for example, a random access memory (RAM) or another type of dynamic storage device that stores information and instructions for execution by the processor 601.
  • the memory unit 602 also stores temporary variables and other intermediate information used during execution of the instructions by the processor 601.
  • the client device 501 further comprises a read only memory (ROM) or another type of static storage device that stores static information and instructions for the processor 601.
  • the I/O controller 603 controls input actions and output actions performed by the FFTS 502.
  • the network interface 604 enables connection of the client device 501 to a network, for example, a short range network or a long range network.
  • the network is, for example, the internet.
  • the network interface 604 is provided as an interface card also referred to as a line card.
  • the network interface 604 comprises, for example, one or more of an infrared (IR) interface, an interface implementing Wi-Fi ® of Wi-Fi Alliance Corporation, a universal serial bus (USB) interface, a FireWire ® interface of Apple Inc., an Ethernet interface, a frame relay interface, a cable interface, a digital subscriber line (DSL) interface, a token ring interface, a peripheral component interconnect (PCI) interface, a local area network (LAN) interface, a wide area network (WAN) interface, interfaces using serial protocols, interfaces using parallel protocols, Ethernet communication interfaces, asynchronous transfer mode (ATM) interfaces, a high speed serial interface (HSSI), a fiber distributed data interface (FDDI), interfaces based on transmission control protocol (TCP)/internet protocol (IP), interfaces based on wireless communications technology such as satellite technology, radio frequency (RF) technology, near field communication, etc.
  • the data bus 605 permits communications between the modules, for example, 50
  • the display unit 606 via the graphical user interface (GUI) 401 exemplarily illustrated in FIGS. 4A-4C, displays information such as the marked-up content, display interfaces, user interface elements such as text fields, etc., for allowing a user of the file format transformation system (FFTS) 502 to view an input page in a first file format and a transformed output page in the reversible second file format.
  • the display unit 606 comprises, for example, a liquid crystal display, a plasma display, an organic light emitting diode (OLED) based display, etc.
  • the input devices 607 are used for inputting data into the client device 501.
  • the users of the client device 501 use the input devices 607 to provide inputs to the FFTS 502.
  • a user may enter a file format or edit an input page on the GUI 401 using the input devices 607.
  • the input devices 607 are, for example, a keyboard such as an alphanumeric keyboard, a microphone, a joystick, a pointing device such as a computer mouse, a touch pad, a light pen, a physical button, a touch sensitive display device, a track ball, a pointing stick, any device capable of sensing a tactile input, etc.
  • Computer applications and computer programs are used for operating the file format transformation system (FFTS) 502.
  • the computer programs are loaded onto the fixed media drive 608 and into the memory unit 602 of the client device 501 via the removable media drive 609.
  • the computer applications and computer programs may be loaded directly via the network.
  • Computer applications and computer programs are executed by double clicking a related icon displayed on the display unit 606 using one of the input devices 607.
  • the output devices 610, for example, a printer, output the results of operations performed by the FFTS 502.
  • the FFTS 502 renders the transformed output page in the reversible second file format using the output devices 610.
  • the processor 601 executes an operating system, for example, the Linux ® operating system, the Unix ® operating system, any version of the Microsoft ® Windows ® operating system, the Mac OS of Apple Inc., the IBM ® OS/2, VxWorks ® of Wind River Systems, Inc., QNX Neutrino ® developed by QNX Software Systems Ltd., Palm OS ® , the Solaris operating system developed by Sun Microsystems, Inc., the Android operating system, the Windows Phone ® operating system of Microsoft Corporation, the BlackBerry ® operating system of BlackBerry Limited, the iOS operating system of Apple Inc., the Symbian™ operating system of Symbian Foundation Limited, etc.
  • the file format transformation system (FFTS) 502 employs the operating system for performing multiple tasks.
  • the operating system is responsible for management and coordination of activities and sharing of resources of the client device 501.
  • the operating system further manages security of the FFTS 502, peripheral devices connected to the client device 501, and network connections.
  • the operating system employed on the client device 501 recognizes, for example, inputs provided by the users using one of the input devices 607, the output display, files, and directories stored locally on the fixed media drive 608.
  • the operating system on the client device 501 executes different computer programs using the processor 601.
  • the processor 601 and the operating system together define a computer system for which application programs in high level programming languages are written.
  • the processor 601 of the client device 501 retrieves instructions defined by the content reception module 502a, the content reflow module 502b, the space and block identification module 502c, the tagging module 502d, the pagination element processing module 502e, the compiler 502f, etc., for performing respective functions disclosed in the detailed description of FIG. 5.
  • the processor 601 retrieves instructions for executing the modules, for example, 502a, 502b, 502c, 502d, 502e, 502f, etc., of the FFTS 502 from the memory unit 602.
  • a program counter determines the location of the instructions in the memory unit 602.
  • the program counter stores a number that identifies the current position in the computer program of each of the modules, for example, 502a, 502b, 502c, 502d, 502e, 502f, etc., of the FFTS 502.
  • the instructions fetched by the processor 601 from the memory unit 602 after being processed are decoded.
  • the instructions are stored in an instruction register in the processor 601. After processing and decoding, the processor 601 executes the instructions, thereby performing one or more processes defined by those instructions.
  • the instructions stored in the instruction register are examined to determine the operations to be performed.
  • the processor 601 then performs the specified operations.
  • the operations comprise arithmetic operations and logic operations.
  • the operating system performs multiple routines for performing a number of tasks required to assign the input devices 607, the output devices 610, and memory for execution of the modules, for example, 502a, 502b, 502c, 502d, 502e, 502f, etc., of the file format transformation system (FFTS) 502.
  • the tasks performed by the operating system comprise, for example, assigning memory to the modules, for example, 502a, 502b, 502c, 502d, 502e, 502f, etc., of the FFTS 502, and to data used by the FFTS 502, moving data between the memory unit 602 and disk units, and handling input/output operations.
  • the operating system performs the tasks on request by the operations and after performing the tasks, the operating system transfers the execution control back to the processor 601.
  • the processor 601 continues the execution to obtain one or more outputs.
  • the outputs of the execution of the modules, for example, 502a, 502b, 502c, 502d, 502e, 502f, etc., of the FFTS 502 are displayed to the user on the display unit 606.
  • Disclosed herein is also a computer program product comprising a non-transitory computer readable storage medium having embodied thereon, computer program codes comprising instructions executable by at least one processor 601 for transforming marked-up content in a first file format to a reversible second file format.
  • the computer program product comprises a first computer program code for receiving the marked-up content of the first file format; a second computer program code for reflowing the received marked-up content of the first file format into a continuous page having a configurable page width; a third computer program code for identifying spaces and block elements in the reflown marked-up content of the first file format; a fourth computer program code for generating and appending tags to the identified spaces and the identified block elements in the reflown marked-up content of the first file format; a fifth computer program code for determining line breaks in the reflown marked-up content of the first file format based on preconfigured criteria associated with the appended tags; a sixth computer program code for tagging the determined line breaks; a seventh computer program code for identifying anchored floats in the reflown marked-up content of the first file format; an eighth computer program code for tagging the identified anchored floats; a ninth computer program code for positioning the tagged anchored floats on a current page based on availability of
  • the ninth computer program code positions the tagged anchored floats proximal to associated float citations on the current page based on the availability of space for the tagged anchored floats on the current page.
  • the twelfth computer program code positions the tagged footnotes proximal to associated footnote citations on the current page based on the availability of space for the tagged footnotes on the current page.
  • the computer program product disclosed herein further comprises one or more additional computer program codes for performing additional steps that may be required and contemplated for transforming marked-up content in a first file format to a reversible second file format.
  • a single piece of computer program code comprising computer executable instructions performs one or more steps of the computer implemented method disclosed herein for transforming marked-up content in a first file format to a reversible second file format.
  • the computer program codes comprising computer executable instructions are embodied on the non-transitory computer readable storage medium.
  • the processor 601 of the client device 501 retrieves these computer executable instructions and executes them. When the computer executable instructions are executed by the processor 601, the computer executable instructions cause the processor 601 to perform the steps of the computer implemented method for transforming marked-up content of a first file format to a reversible second file format.
  • FIGS. 7A-7Q exemplarily illustrate screenshots showing transformation of marked-up content in a first file format to a reversible second file format in edit and proof views.
  • the file format transformation system (FFTS) 502 is configured as a software application on a client device 501 exemplarily illustrated in FIG. 5, for example, a personal computer, a laptop, a smart phone, a tablet computing device, etc.
  • a user of the client device 501 may want to edit and review a technical document of, for example, a hypertext markup language (HTML) format that is viewed as a running continuous page.
  • the user invokes the FFTS 502 on the client device 501 and loads the input HTML document into the FFTS 502.
  • the FFTS 502 allows the user to view the input HTML document via a graphical user interface (GUI) 401 of the FFTS 502.
  • FIG. 7A exemplarily illustrates a screenshot of an opening page of the loaded input HTML document without an edit window 402 in a right pane of the GUI 401.
  • FIG. 7B exemplarily illustrates a screenshot of the opening page of the loaded input HTML document, showing the edit window 402 in the right pane of the GUI 401.
  • the edit window 402 allows the user to edit the input HTML document or accept suggested changes made by other users to the input HTML document in an edit view exemplarily illustrated in FIG. 7B.
  • FIG. 7C exemplarily illustrates a screenshot of the output HTML page transformed by the FFTS 502 to a reversible file format, showing a header 701 and a footer 702 entered on the opening page in a proof view.
  • the FFTS 502 positions the marked-up content in an appropriate location close to their respective citations in the proof view.
  • the opening page in the reversible file format can be reversed to the first file format in the edit view.
  • FIG. 7D exemplarily illustrates a screenshot without the edit window 402 in the right pane of the GUI 401, showing hyphenations 703 entered in a page of the input hypertext markup language (HTML) document.
  • FIG. 7E exemplarily illustrates a screenshot showing the hyphenations 703 entered in the HTML page. The user can edit the HTML page using the edit window 402 in the right pane of the GUI 401 exemplarily illustrated in FIG. 7E.
  • the edit window 402 allows the user to edit the hyphenated HTML page.
  • FIG. 7F exemplarily illustrates a screenshot of the output HTML page with hyphenations 703 transformed by the FFTS 502 to a reversible file format, showing the header 701 entered in the hyphenated HTML page in a proof view.
  • FIG. 7G exemplarily illustrates a screenshot of a page of the input hypertext markup language (HTML) document containing floats, for example, figures 704, without the edit window 402 in the right pane of the GUI 401.
  • FIG. 7H exemplarily illustrates a screenshot of the page of the input HTML document containing the figures 704, showing the edit window 402 in the right pane of the GUI 401.
  • the edit window 402 allows the user to edit the input HTML page containing the figures 704.
  • FIG. 7I exemplarily illustrates a screenshot of the output HTML page transformed by the file format transformation system (FFTS) 502 exemplarily illustrated in FIG.
  • FIG. 7J exemplarily illustrates a screenshot of a page of the input hypertext markup language (HTML) document containing a float, for example, a table 706, without the edit window 402 in a right pane of the GUI 401.
  • FIG. 7K exemplarily illustrates a screenshot of the page of the input HTML document containing the table 706, showing the edit window 402 in the right pane of the GUI 401.
  • the edit window 402 allows the user to edit the HTML page containing the table 706.
  • FIG. 7L exemplarily illustrates a screenshot of the output HTML page transformed by the file format transformation system (FFTS) 502 exemplarily illustrated in FIG.
  • FFTS 502 positions the table 706 in the appropriate location close to a respective citation in the proof view.
  • FIG. 7M exemplarily illustrates a screenshot of a page of the input hypertext markup language (HTML) document containing footnotes 707, without the edit window 402 in a right pane of the GUI 401.
  • FIG. 7N exemplarily illustrates a screenshot of the page of the input HTML document containing the footnotes 707, showing the edit window 402 in the right pane of the GUI 401.
  • the edit window 402 allows the user to edit the page containing the footnotes 707.
  • FIG. 7O exemplarily illustrates a screenshot of the output HTML page transformed by the file format transformation system (FFTS) 502 exemplarily illustrated in FIG. 5, to a reversible file format, showing the header 701, the footer 702, a page number 705 entered on the page, and the footnotes 707 positioned in the footnote section below a footnote ruler 708 in a proof view.
  • FIG. 7P exemplarily illustrates a screenshot of an output HTML page transformed by the file format transformation system (FFTS) 502 exemplarily illustrated in FIG. 5, to a reversible file format, showing the header 701 and the footer 702 at the top and the bottom of the page respectively in a proof view.
  • the output HTML page also contains a page number 705 and a footnote 707 positioned in the footnote section below a footnote ruler 708 in the proof view.
  • FIG. 7Q exemplarily illustrates a screenshot of output HTML pages transformed by the file format transformation system (FFTS) 502 exemplarily illustrated in FIG. 5, to a reversible file format, showing a page break 709 in a proof view.
  • the FFTS 502 breaks the running continuous input HTML page into individual reversible file format pages containing a header 701 and a footer 702, and renders the output on the GUI 401.
  • Non-transitory computer readable media refers to non-transitory computer readable media that participate in providing data, for example, instructions that may be read by a computer, a processor or a similar device.
  • Non-transitory computer readable media comprise all computer readable media, for example, non-volatile media, volatile media, and transmission media, except for a transitory, propagating signal.
  • Non-volatile media comprise, for example, optical discs or magnetic disks and other persistent memory. Volatile media include, for example, a dynamic random access memory (DRAM), which typically constitutes a main memory.
  • Volatile media comprise, for example, a register memory, a processor cache, a random access memory (RAM), etc.
  • Transmission media comprise, for example, coaxial cables, copper wire, fiber optic cables, modems, etc., including wires that constitute a system bus coupled to a processor, etc.
  • Common forms of computer readable media comprise, for example, a floppy disk, a flexible disk, a hard disk, magnetic tape, a laser disc, a Blu-ray Disc® of the Blu-ray Disc Association, any magnetic medium, a compact disc-read only memory (CD-ROM), a digital versatile disc (DVD), any optical medium, a flash memory card, punch cards, paper tape, any other physical medium with patterns of holes, a random access memory (RAM), a programmable read only memory (PROM), an erasable programmable read only memory (EPROM), an electrically erasable programmable read only memory (EEPROM), a flash memory, any other memory chip or cartridge, or any other medium from which a computer can read.
  • the computer programs that implement the methods and algorithms disclosed herein may be stored and transmitted using a variety of media, for example, the computer readable media in a number of manners.
  • hard-wired circuitry or custom hardware may be used, in place of, or in combination with, software instructions for implementation of the processes of various embodiments. Therefore, the embodiments are not limited to any specific combination of hardware.
  • the computer program codes comprising computer executable instructions may be implemented in any programming language that runs on an internet browser, for example, Chrome™ of Google Inc., Firefox® of Mozilla Foundation, Safari® of Apple Inc., Internet Explorer® of Microsoft Corporation, etc., on any operating system.
  • the computer program codes or software programs may be stored on or in one or more mediums as object code.
  • FFTS 502 may be implemented as programmed elements, or non-programmed elements, or any suitable
  • the computer program product disclosed herein comprises one or more computer program codes for implementing the processes of various embodiments.
  • the computer implemented method and the FFTS 502 disclosed herein are not limited to a particular computer system platform, processor, or operating system.
  • the foregoing examples have been provided merely for the purpose of explanation and are in no way to be construed as limiting of the computer implemented method and the file format transformation system (FFTS) 502 disclosed herein. While the computer implemented method and the FFTS 502 have been described with reference to various embodiments, it is understood that the words, which have been used herein, are words of description and illustration, rather than words of limitation.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

A method and a file format transformation system (FFTS) for transforming marked-up content in a first file format (FFF) to a reversible second file format (RSFF) are provided. The FFTS reflows marked-up content of the FFF into a continuous page. The FFTS generates and appends tags to spaces and block elements identified in the reflown marked-up content of the FFF. For each space and block element, the FFTS determines and tags line breaks in the reflown marked-up content. For each line break, the FFTS identifies, tags, and positions anchored floats and footnotes on a current page based on space availability. The FFTS positions page breaks in the continuous page based on a configurable page height and the line breaks. The FFTS groups the marked-up content, inserts pagination elements, for example, page numbers, etc., and renders the grouped marked-up content in the RSFF, which is reversible to restore the continuous page.

Description

TRANSFORMATION OF MARKED-UP CONTENT TO A REVERSIBLE FILE FORMAT FOR AUTOMATED BROWSER BASED PAGINATION
CROSS-REFERENCE TO RELATED APPLICATIONS
This application is a PCT application that claims priority to and the benefit of non-provisional patent application number 3348/CHE/2015 titled "Transformation Of Marked-up Content To A Reversible File Format For Automated Browser Based Pagination", filed in the Indian Patent Office on 01 July 2015. The specification of the above referenced patent application is incorporated herein by reference in its entirety.
BACKGROUND
With the increase in internet usage and applications, users are now accessing information and searching for information online. Information on the web is typically represented through electronic documents created using markup languages. Electronic documents created using markup languages are easily accessible to users through a typical web browser. A typical markup language document is made of different types of content, for example, textual content, images, videos, etc., and carries syntax information that instructs a browser how to render different types of content in the markup language document to a user. The syntax information comprises a set of markup language tags that are executed on the browser. Furthermore, rendering a document on a browser can be controlled, for example, by using cascading style sheets (CSS) that describe the formatting of a document written in a markup language. A CSS document is typically attached, embedded, or linked to a markup language document. The CSS defines how each element, for example, font size of text, color of a background or text, position and alignment of content elements, etc., in the markup language document appears on the browser.
Conventional markup language documents are typically displayed as continuous running documents without any page breaks. These continuous running documents are not print-friendly. A typical markup language document can accommodate a large amount of content, whereas a standard print ready page has, for example, 8.5" x 11" dimensions with margins that reduce the space available for accommodation of a large amount of content during a print operation. The content has to be broken at two levels, that is, a horizontal level or page width and a vertical level or page height. The page width relates to a line break, and the page height relates to a page break. Content rendering on a browser can have loose lines, and spaces are often distributed in ways that make a page appear to have rivers of blanks flowing through the page. How the browser renders this content has to be understood in order to meaningfully interpret the content subsequently. Line breaks rendered by the browser can be discerned as belonging to four different types, namely, word space breaks (wsbr), soft hyphen breaks (wshbr), hard breaks (wbr), and para breaks (wsp). Word space breaks are discerned by finding which spaces are quashed to a zero width. The word space breaks are then interpreted as the end of a line or a line break. Similarly, for manually introduced soft hyphens, if a line breaks at a soft hyphen, then the soft hyphen attains a non-zero width which is also interpreted as the end of the line or as a line break. A hard line break can be discerned when an offset decrease is encountered. Therefore, any markup language content that falls outside a printing area needs to be resized and repositioned accordingly for an optimal print output without losing any data when a print operation is performed. One method for printing continuous running pages involves introducing page breaks based on a vertical height equal to a page of printing media upon which the content is to be printed.
The problem with relying on introducing page breaks based on the vertical height is that text lines and other content are disrupted in between a page and the same is printed. There are additional problems, for example, numbering the pages as page numbers are forced and not based on the content, page layout issues on print media and on handheld devices, etc. Floats such as images and tables can split and spill across pages and trying to avoid these can result in large vertical gaps, making the presentation undesirable.
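By way of illustration only, the three discernment rules stated above (a word space quashed to zero width, a soft hyphen attaining a non-zero width, and an offset decrease) can be sketched as follows. The `Token` record, its field names, and the `classify_breaks` function are hypothetical names introduced for this sketch; the widths and offsets are assumed to have been measured from the browser's rendering beforehand. Para breaks (wsp) are not classified here because the text above does not state how they are discerned.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Token:
    # Hypothetical record of one rendered item; not part of the source.
    kind: str     # "word", "space", or "shy" (soft hyphen)
    width: float  # rendered width, assumed measured in the browser
    left: float   # rendered horizontal offset, assumed measured in the browser


def classify_breaks(tokens: List[Token]) -> List[Optional[str]]:
    """Label each token that ends a line with its break type, else None."""
    labels: List[Optional[str]] = [None] * len(tokens)
    for i, tok in enumerate(tokens):
        # An offset decrease means a new line started. If the previous token
        # was not already explained by a space or soft hyphen break, treat it
        # as a hard break.
        if i > 0 and tok.left < tokens[i - 1].left and labels[i - 1] is None:
            labels[i - 1] = "wbr"
        if tok.kind == "space" and tok.width == 0:
            labels[i] = "wsbr"   # word space quashed to zero width
        elif tok.kind == "shy" and tok.width > 0:
            labels[i] = "wshbr"  # soft hyphen became visible at a line end
    return labels
```

The sketch labels the token at which a line ends rather than the token that starts the next line, which is a design choice made for this example only.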
Content in a document can be easily read by a computer when the content is marked up. In markup language documents, for example, hypertext markup language (HTML) documents, word spaces and line breaks are not explicitly tagged. The word spaces and the line breaks remain anonymous, for example, as generic word spaces and line breaks, and hence are difficult to read and understand for printing accurately. With the advent of handheld devices, for example, smartphones, tablets, etc., there is a need for an optimized rendering of markup language documents and hence the concept of a fluid page was originated. The non-print-friendly documents, page numbering issues, and other page layout problems still exist in fluid pages. There is a need for bridging fluid web-content and fixed-page typesetting originating as a fluid HTML, without a reference printer at the destination. Markup language documents are typically interactive and dynamic in nature, whereas the print is essentially static in nature. For example, hypertext markup language (HTML) documents contain free flowing or reflowing content. Images, paragraphs, videos and other similar content are arranged in an HTML document as tags. HTML documents are adaptable to different devices. That is, if an HTML document is viewed in a web browser, then the HTML document adapts to the web browser and displays content of the HTML document as per the specifications of the web browser. If this HTML document is viewed on a mobile browser of a mobile device, then the HTML document adapts to the specifications of the mobile browser. However, the HTML content is not suitable to print. Since the HTML content is not fixed, a printer would interpret specific elements of the HTML content inaccurately and therefore print the HTML content inaccurately. While there are many transformation techniques and file formats, these file formats are not reversible and do not restore fluidity of the transformed markup language documents.
One of the main reasons that the fluidity cannot be restored is that the page output in non-reversible file formats are defined graphically as a set of printer instructions at a glyph level that lose structural information at a character level and a content level.
Markup language content and associated content elements are interpreted and defined using markup language tags on any standard web browser. The tags included in a markup language document are typically executed on a server or on a web browser. Scripts or tags that run directly on a web browser have less latency time compared to a server side execution of tags. Moreover, a server side execution of tags requires an active network connection, whereas a client side execution of web browser compatible tags runs without an active network connection. Most textual markup language documents are rendered in a client-server architecture, where there are delays and additional communication cost between a server and a user's client device for presenting and printing markup language documents. Pagination of a hypertext markup language (HTML) document involves partitioning content of the HTML document and presenting the partitioned content on individual pages. Conventional solutions include pagination of HTML documents based either on cut-off markers or the number of items to be displayed per page. These solutions are typically implemented using server side technologies. There is a need for a client side implementation, and there have been a few attempts at client side pagination due to the improved performance that the client side pagination can yield.
US Patent No. 7,647,553 B2 provides a hypertext markup language view template that allows a hypertext markup language content document to flow into a series of containers. This is performed by identifying the layout of the hypertext markup language document by using view templates. In this method, a hypertext markup language authorship is provided that takes a bottomless continuous running hypertext markup language page and positions the content in a series of predefined containers within the display media. The content is flowed into the predefined containers. This method does not handle the positioning of footnotes on the same page where respective footnote citations reside, which makes it difficult for a user to refer to citations. This method also does not place floats proximate to their corresponding citations, which makes it difficult for the user to access floats corresponding to the citations. Furthermore, this method does not address header and footer conversion issues.
US Patent No. 6,789,229 B1 addresses issues with pagination that involves more processor intensive tasks. This method uses pagination techniques that involve determining reproducible pages followed by numbering individual pages based on hard breaks. This method requires a predetermined list of hard breaks occurring in the document being processed which requires a lot of processing time to display page numbers and therefore, there is a need for a faster and efficient technique to process page numbers.
A publication by Hewlett-Packard Laboratories titled "Automatic Pagination of HTML Documents in a Web Browser" discloses automatic pagination of hypertext markup language (HTML) documents on the client side. The methods disclosed in this publication utilize a built-in library of JavaScript® functions in a browser and size attributes to format an HTML page. The paginations are performed through extensible stylesheet language transformation (XSLT). These pagination techniques render page numbers in tabs which occupy more space if the number of pages is large. These methods do not handle page numbers when a print operation is initiated. Moreover, these methods do not position floats and footnotes on the same page where their respective citations reside. These methods transform a regular HTML page into individual pages with paginated tabs, but do not efficiently handle a journal or a novel style HTML page, which translates to hundreds or even thousands of individual pages.
Conventional file formats, for example, the portable document format (PDF) of Adobe Systems Incorporated and the electronic publication (ePub®) format of Open eBook Forum DBA are two typical file formats used in documentation. The portable document format is based on a fixed layout and does not support a fluid layout. Page numbers in the portable document format are forced and not based on the content. The ePub file format is designed with reflowable content, which can optimize text and graphics according to a display device. However, the ePub file format does not support header and footer at a conversion stage, places floats at random locations, and does not proxy floats, for example, videos and long tables to a linked source, thereby hindering the user experience.
Hence, there is a long felt but unresolved need for a computer implemented method and a file format transformation system deployed on a client device that transforms marked-up content in a first file format, for example, a hypertext markup language (HTML) format to a reversible second file format that can be stored offline, executed with less latency and without an active network connection on any browser on any operating system, and can be restored to a continuous page. Moreover, there is a need for a computer implemented method and a file format transformation system that implements document tagging of all content including spaces and line breaks to transform fluid pages to fixed pages that are print-friendly and provide a fixed page view that captures document elements, for example, line breaks, floats, footnotes or end notes, page numbers, headers and footers, captions, etc., which are expressed relationally and assigned page appropriate placement. Furthermore, there is a need for a computer implemented method and a file format transformation system that position floats and footnotes on the same page where their respective citations reside, support headers and footers at a conversion stage, place floats at appropriate locations, and proxy floats, for example, videos and long tables to a linked source, thereby enhancing the user experience.
SUMMARY OF THE INVENTION
This summary is provided to introduce a selection of concepts in a simplified form that are further disclosed in the detailed description of the invention. This summary is not intended to identify key or essential inventive concepts of the claimed subject matter, nor is it intended for determining the scope of the claimed subject matter.
The method and the file format transformation system (FFTS) disclosed herein address the above stated need for transforming marked-up content in a first file format, for example, a hypertext markup language (HTML) format to a reversible second file format that can be stored offline, executed with less latency and without an active network connection on any browser on any operating system, and can be restored to a continuous page. Moreover, the method and the FFTS disclosed herein implement document tagging of all content including spaces and line breaks to transform fluid pages to fixed pages that are print-friendly and provide a fixed page view that captures document elements, for example, line breaks, floats, footnotes or end notes, page numbers, headers and footers, captions, etc., which are expressed relationally and assigned page appropriate placement. In the absence of tags for blanks, for example, word spaces and line breaks, it would be difficult to instruct and cajole a browser to reflow content, thereby limiting the scope to the browser's default content flow. Consequently, a meaningful page break cannot be assigned and scripts that interpret the tags to produce page breaks do not have handles with which to produce the page breaks. In the reversible second file format disclosed herein, word spaces, line breaks, and page breaks are explicitly tagged. The FFTS therefore generates fixed format virtual pages in the reversible second file format with line breaks and page breaks placed in appropriate locations within a continuous document. In a virtual page rendering, the continuous page is first rendered with demarcation lines or page borders for page breaks. A cascading style sheet (CSS) instruction is provided to a printer to not print the demarcation lines but interpret them instead as page breaks.
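As a minimal sketch of why tagged line breaks give a script handles for placing page breaks, the following example positions page breaks only at line boundaries so that no text line is disrupted between pages. The function name `position_page_breaks` and the representation of the continuous page as a list of line heights are assumptions made for illustration; floats, footnotes, headers, and footers are deliberately left out.

```python
from typing import List


def position_page_breaks(line_heights: List[float], page_height: float) -> List[int]:
    """Return the indices of the lines after which a page break is placed.

    Breaks fall only between lines, so a line is never split across pages.
    A line taller than the page is still placed alone on its own page.
    """
    breaks: List[int] = []
    used = 0.0  # height already consumed on the current page
    for i, height in enumerate(line_heights):
        if used > 0 and used + height > page_height:
            breaks.append(i - 1)  # break after the previous line
            used = 0.0
        used += height
    return breaks
```

For example, seven lines of height 10 on a page of configurable height 30 yield breaks after the third and sixth lines, that is, `[2, 5]`.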
The client side implementation of the method and the file format transformation system (FFTS) disclosed herein allows a user of a document to be presented with an alternate presentation of the document without additional communication costs between a server and the user's client device. Moreover, the client side implementation of the method and the FFTS disclosed herein enables automated browser based pagination of markup language documents, for example, hypertext markup language (HTML) documents based on the dimensions of a web browser's window and the rendered size of components. The reversible file format allows a user to view the page-broken document as a continuous document on a browser. The user can switch between the two views. The computer implemented method and the FFTS disclosed herein position floats and footnotes on the same page where their respective citations reside, support headers and footers at a conversion stage, place floats at appropriate locations, and proxy floats, for example, videos and long tables to a linked source, thereby enhancing the user experience. The computer implemented method disclosed herein is minimalistic in terms of document object model (DOM) manipulation and performs minimum manipulation to create pages.
The computer implemented method disclosed herein employs the file format transformation system (FFTS) deployed on a client device comprising at least one processor configured to execute computer program instructions for transforming marked-up content in a first file format to a reversible second file format. The FFTS receives the marked-up content of the first file format. The FFTS reflows the received marked-up content of the first file format into a continuous page having a configurable page width. The FFTS identifies spaces and block elements in the reflown marked-up content of the first file format. The FFTS generates and appends tags to the identified spaces and the identified block elements in the reflown marked-up content of the first file format. For each of the identified spaces and the identified block elements, the FFTS determines line breaks in the reflown marked-up content of the first file format based on preconfigured criteria associated with the appended tags and tags the determined line breaks. For each of the determined line breaks, the FFTS identifies anchored floats, for example, figures, tables, images, videos, etc., in the reflown marked-up content of the first file format and tags the identified anchored floats. The FFTS positions the tagged anchored floats on a current page based on availability of space for the tagged anchored floats on the current page. The FFTS identifies footnotes in the reflown marked-up content of the first file format and tags the identified footnotes. The FFTS positions the tagged footnotes at a footnote section on the current page based on availability of space for the tagged footnotes on the current page. The FFTS positions page breaks in the continuous page based on a configurable page height and the determined line breaks for the positioning of the tagged anchored floats and the tagged footnotes on a subsequent page on non-availability of the space on the current page. The FFTS groups the marked-up content with the positioned anchored floats and the positioned footnotes on each page.
The FFTS inserts one or more of multiple pagination elements, for example, page numbers, a header, a footer, etc., on each page containing the grouped marked-up content. The FFTS renders the grouped marked-up content with the inserted pagination elements in the reversible second file format. The method disclosed herein performs tagging of the spaces and the block elements with <span data-ph5="ws"> and <span data-ph5="wsp">, and tagging the line breaks with <span data-ph5="wsbr">, the para breaks with <span data-ph5="wsp">, etc. The reversible second file format allows the marked-up content to be reversed to the first file format, restoring continuity, for example, by converting <span data-ph5="wsbr"> and <span data-ph5="wsp"> back to <span data-ph5="ws">. The data-ph5 attribute described above pertains to hypertext markup language 5 (HTML5). For backward compatibility with HTML 4, the "class" attribute can be used instead of the data-ph5 attribute. It may be noted that "class" attribute expressions in legacy HTML impose certain limitations to reversibility compared to the data-ph5 attribute in HTML5. In one or more embodiments, related systems comprise circuitry and/or programming for effecting the methods disclosed herein; the circuitry and/or programming can be any combination of hardware, software, and/or firmware configured to effect the methods disclosed herein depending upon the design choices of a system designer. Also, various structural elements may be employed depending on the design choices of the system designer.
BRIEF DESCRIPTION OF THE DRAWINGS
The foregoing summary, as well as the following detailed description of the invention, is better understood when read in conjunction with the appended drawings. For the purpose of illustrating the invention, exemplary constructions of the invention are shown in the drawings.
However, the invention is not limited to the specific methods and components disclosed herein.
The description of a method step or a component referenced by a numeral in a drawing is applicable to the description of that method step or component shown by that same numeral in any subsequent drawing herein.
FIGS. 1A-1B illustrate a computer implemented method for transforming marked-up content in a first file format to a reversible second file format. FIG. 2 exemplarily illustrates an interpretation of marked-up content in a reversible second file format.
FIGS. 3A-3F exemplarily illustrate a flowchart comprising the steps performed by a file format transformation system for transforming marked-up content in a first file format to a reversible second file format.
FIGS. 4A-4B exemplarily illustrate screenshots showing edit views of marked-up content.
FIG. 4C exemplarily illustrates a screenshot showing a proof view of the marked-up content rendered in a reversible second file format.
FIG. 4D exemplarily illustrates a screenshot showing a source code of the marked-up content rendered in a reversible second file format. FIG. 5 illustrates a system comprising a file format transformation system deployed on a client device for transforming marked-up content in a first file format to a reversible second file format.
FIG. 6 exemplarily illustrates the hardware architecture of a client device that deploys the file format transformation system for transforming marked-up content in a first file format to a reversible second file format.
FIGS. 7A-7Q exemplarily illustrate screenshots showing transformation of marked-up content in a first file format to a reversible second file format in edit and proof views.
DETAILED DESCRIPTION OF THE INVENTION
FIGS. 1A-1B illustrate a computer implemented method for transforming marked-up content in a first file format to a reversible second file format, hereafter referred to as a
"reversible file format". As used herein, "marked-up content" refers to content having markups or appended tags that indicate the type of content, for example, a header, a footer, a caption, a table, a figure, an image, a video, a line break, etc. As used herein, "line break" refers to a pagination element representing the end of a line of text. Also, as used herein, "reversible file format" refers to a file format that can be back transformed into the first file format. The reversible file format disclosed herein is named, for example, as "PH5" that represents pagination with hypertext markup language 5 (HTML5) and comprises a set of properties including tags that are generated in accordance with structural semantics of documents in the first file format, for example, hypertext markup language (HTML) documents, and recognizes scripts that shape the PH5 output. The scripts that shape the PH5 output vary.
The computer implemented method disclosed herein employs a file format transformation system (FFTS) deployed on a client device comprising at least one processor configured to execute computer program instructions for transforming marked-up content in a first file format to a reversible second file format. The client device is a computing device, for example, a personal computer, a tablet computing device, a mobile computer, a mobile phone, a smart phone, etc. The FFTS converts web content seamlessly using document tagging. The file format transformation system (FFTS) receives 101 marked-up content of a first file format, for example, a hypertext markup language (HTML), an extensible hypertext markup language format
(XHTML), etc. The FFTS receives document contents, for example, in the HTML format. In an embodiment, the first file format is an extensible markup language (XML). In this embodiment, the FFTS converts a document from the XML format to an HTML format and then transforms the mark-up content in the HTML format to the reversible file format. A browser that loads the marked-up content of the first file format inserts code points, for example, soft hyphens in the marked-up content of the first file format based on dictionary elements, for example, dictionary syllables such as - im-por-tant, con-se-quence, ap-pear-ance, etc. As used herein, a "soft hyphen" refers to a code point reserved in coded character sets used for breaking words across lines by inserting visible hyphens. Unicode defines the soft hyphens as invisible characters that allow a manual specification of a position where a hyphenated break is allowed without forcing a line break in an inconvenient place if the content or text is later reflowed. The FFTS reflows 102 the received marked-up content of the first file format into a continuous page having a configurable page width. As used herein, the term "reflow" refers to a browser process of recalculating positions of HTML elements in the HTML content and re-rendering the HTML elements with new positions.
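By way of illustration only, the dictionary based insertion of soft hyphens described above can be sketched as follows. The syllable table, the function names insertSoftHyphens and stripSoftHyphens, and the restriction to plain text input are assumptions made for this sketch and are not part of the disclosure; in the disclosed method the browser performs this step on the loaded marked-up content.

```javascript
// Unicode soft hyphen (U+00AD): invisible unless a line actually breaks at it.
const SOFT_HYPHEN = "\u00AD";

// Hypothetical syllable dictionary; "-" marks each permitted break point.
const SYLLABLES = new Map([
  ["important", "im-por-tant"],
  ["consequence", "con-se-quence"],
  ["appearance", "ap-pear-ance"],
]);

// Insert a soft hyphen at every dictionary break point of every known word.
// The letters are copied from the original word so that its case is preserved.
function insertSoftHyphens(text, dictionary = SYLLABLES) {
  return text.replace(/[A-Za-z]+/g, (word) => {
    const pattern = dictionary.get(word.toLowerCase());
    if (!pattern) return word; // unknown words are left untouched
    let index = 0;
    let result = "";
    for (const ch of pattern) {
      result += ch === "-" ? SOFT_HYPHEN : word[index++];
    }
    return result;
  });
}

// Removing the soft hyphens recovers the original text exactly.
function stripSoftHyphens(text) {
  return text.split(SOFT_HYPHEN).join("");
}
```

Because the soft hyphen is an invisible code point, the text reads the same until a reflow places a line end at one of the marked positions.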
The file format transformation system (FFTS) identifies 103 spaces and block elements in the reflown marked-up content of the first file format. In an embodiment, the FFTS identifies existing break elements, for example, hard breaks such as soft hyphen breaks, line breaks, and para breaks in the reflown marked-up content of the first file format. The FFTS also identifies unanchored or uncited floats in the reflown marked-up content of the first file format. The block elements are content elements that create blocks or large groupings of content and generally begin new lines of text. The block elements expand to fill a parent container containing text, inline elements, etc., and can have margins and/or padding, fitting the child elements. The <div> element is a block element in the hypertext markup language (HTML). The block elements, for example, (<div>, <h1> - <h6>, <p>) in a document start on a new line and take up the full width available. The FFTS generates and appends 104 tags to the identified spaces and the identified block elements in the reflown marked-up content of the first file format. The FFTS generates tags in accordance with structural semantics of the marked-up content, which then helps the scripts recognize the tags. The FFTS replaces the identified word spaces, for example, with <span data-PH5 = WS>, where the term "span" is a tag used to group inline elements, for example, <a>, <img>, etc., in the HTML that do not start on a new line and only take up a necessary width. As used herein, "word space" refers to a single space between two words. The FFTS tags the identified block elements, for example, as <div class WSP>, where "WSP" refers to para break. In an embodiment, floats and footnotes have prior representation in an input document of the first file format, for example, the HTML format and need no specific tagging.
As used herein, "floats" refer, for example, to images, videos, audio content, tables, figures, etc., that float unhinged from the main content flow, except in their relationship to their citations as available in the input document. Also, as used herein, the term "footnotes" refers to content that is intended to be placed at the bottom of a page and used to cite references to content on the page. Image floats have, for example, <img> tags. Table floats can be recognized by the presence of various tag elements, for example, <td>, <tr>, etc. Footnotes are in a number series and are shown as superscript <sup> numbers that are assigned to specific locations in the main content flow, and these superscripts reference notes appended to the main content, for example, at the bottom in a continuous page.
For each of the identified spaces and the identified block elements 105, the file format transformation system (FFTS) determines 106 line breaks in the reflown marked-up content of the first file format based on preconfigured criteria associated with the appended tags
exemplarily illustrated in FIG. 3C, and tags the determined line breaks. The line breaks retain integrity of the reversible second file format by hyphenating and adjusting spaces in the marked-up content rendered in the reversible file format. In an embodiment, the FFTS identifies the line breaks through JavaScript® developed by Sun Microsystems, Inc. For each of the determined line breaks 107, the file format transformation system (FFTS) identifies 108 anchored floats in the reflown marked-up content of the first file format and tags the identified anchored floats. The FFTS positions 109 the tagged anchored floats on a current page based on availability of space for the identified anchored floats on the current page. The FFTS positions the tagged anchored floats proximal to associated float citations on the current page based on availability of space for the tagged anchored floats on the current page. The FFTS identifies 110 footnotes in the reflown marked-up content of the first file format and tags the identified footnotes. The FFTS places the footnotes initially as "line notes" immediately below the cited line, works out the available space after flowing the main text, and then reflows the footnote to the bottom of the same page. The FFTS positions 111 the tagged footnotes at a footnote section on the current page based on availability of space for the tagged footnotes on the current page. The FFTS positions the tagged footnotes proximal to associated footnote citations on the current page based on availability of space for the tagged footnotes on the current page. The FFTS positions page breaks 112 in the continuous page based on a configurable page height and the determined line breaks for the positioning of the tagged anchored floats and the tagged footnotes on a subsequent page on non-availability of space on the current page. As used herein, "page break" refers to a marker that indicates that content which follows the marker is part of a new page.
The FFTS groups 113 the marked-up content with the positioned anchored floats and the positioned footnotes on each page. The FFTS inserts 114 one or more of multiple pagination elements, for example, page numbers, a header, a footer, a footnote ruler, fillers, etc., on each page containing the grouped marked-up content.
The FFTS tags the identified word spaces, for example, as <span data-ph5="ws">. The FFTS tags the line breaks, for example, as <span data-ph5="wsbr">. The FFTS represents the lines ending with hyphenations, for example, as <span data-ph5="wshbr">. At the end of every paragraph in the reflown marked-up content, the file format transformation system (FFTS) introduces a paragraph break. As used herein, "paragraph break" refers to a pagination element representing the end of a paragraph. The paragraph break is a non-intrusive data model that preserves an original data model of the hypertext markup language (HTML). The FFTS represents the paragraphs, for example, as <p>, <div>, etc., and appends appropriate tags, for example, <div data-ph5="wsp"> to the paragraphs. The FFTS tags the paragraph breaks, for example, as <div data-ph5="wsp">.
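By way of illustration only, the word space and line break tagging described above can be sketched as follows for a single paragraph of plain text whose line ends are already known. The function names phSpan and tagLines, the input shape (one string per laid-out line), and the choice to wrap a single space character inside each span are assumptions made for this sketch; the disclosure specifies only the tag names.

```javascript
// Build a PH5 span of the given kind around a single space character.
// Wrapping the space itself is an assumption of this sketch.
const phSpan = (kind) => `<span data-ph5="${kind}"> </span>`;

// lines: array of strings, one per laid-out line of a single paragraph.
// Spaces inside a line become "ws" spans; the space at each line end
// becomes a "wsbr" span. No break is emitted after the last line.
function tagLines(lines) {
  return lines
    .map((line) => line.split(" ").join(phSpan("ws")))
    .join(phSpan("wsbr"));
}
```

For example, tagLines(["Neurotrophins are", "a family"]) yields one "ws" span inside each line and one "wsbr" span between the two lines, while the words themselves are left unchanged.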
The file format transformation system (FFTS) positions the floats, for example, figures, tables, text boxes, etc., closer to anchors within the available space. Where anchors are not available, the FFTS appends anchors at the input location of the float. The FFTS represents the floats, for example, as <div data-ph5="float"> with a relevant identifier (id) attribute. The corresponding anchors are represented as <span data-ph5="float-anchor"> with a "refid" attribute matching the "id" attribute value of the corresponding float. The FFTS initially positions the floats near their anchors and then moves the floats to the bottom or top of the current page, or to one of the following pages according to the availability of space similar to footnotes. The FFTS positions floats, for example, images, tables, text boxes, pull outs, etc., in proximity to the anchor and ensures that grouped elements such as captions for the floats, if any, appear immediately before or after the floats, and that the captions are not widowed or orphaned. The FFTS handles the grouped elements comprising, for example, a float and a caption associated with the float in the reversible file format at a position assigned in the marked-up content of the first file format to the float. The file format transformation system (FFTS) declares uniform resource locator (URL) breaks to a paging engine. The FFTS couples expressions such as footnotes to page breaks. The page break breaks a web page into a predefined length and delivers cut pages, while ensuring headings and words at the beginning and end of paragraphs are not widowed or orphaned. The FFTS introduces page breaks when a script cookie cuts the fluid page to a reference dimension. The FFTS introduces a page break tag, for example, <div data-ph5="wspbr"> to the appropriate line break. The FFTS initially positions footnotes next to the corresponding citations.
The FFTS moves the footnotes to the footer section of the page after introduction of the page breaks. The FFTS tags the footnotes, for example, as <div data-ph5="footnote">, where the first footnote comprises an additional class called "firstFootnote" and the rest of the footnotes comprise an additional class called "notFirstFootnote". The FFTS numbers the footnotes and positions the footnotes at the bottom of the relevant page.
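By way of illustration only, the pairing of floats with anchors described above, including the appending of an anchor where none is available, can be sketched as follows. The object shapes ({ id, position } for floats, { refid, position } for anchors), the "generated" flag, and the function names ensureAnchors and anchorMarkup are assumptions made for this sketch and are not part of the disclosure.

```javascript
// floats:  [{ id, position }]     position = input location of the float
// anchors: [{ refid, position }]  refid matches the id of the cited float
// Returns the anchors plus one generated anchor for every float that has none,
// placed at the input location of that float.
function ensureAnchors(floats, anchors) {
  const anchored = new Set(anchors.map((anchor) => anchor.refid));
  const generated = floats
    .filter((float) => !anchored.has(float.id))
    .map((float) => ({ refid: float.id, position: float.position, generated: true }));
  return [...anchors, ...generated];
}

// Serialize an anchor using the tag and attribute names given in the description.
function anchorMarkup(anchor) {
  return `<span data-ph5="float-anchor" refid="${anchor.refid}"></span>`;
}
```

Keeping the generated anchors distinguishable from the original ones is a design choice of the sketch; it would allow them to be dropped again when the content is reverted to the first file format.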
The file format transformation system (FFTS) inserts page numbers, a header, a footer, a footnote ruler, fillers, etc., or any combination thereof in one or more pages in the reversible second file format. The FFTS inserts page number tags, for example, <div data-ph5="page-number"> in the line breaks. The FFTS inserts page numbers on the pages based on a predefined numbering style. The FFTS inserts the footnote ruler, for example, as a horizontal line to separate running text and the footnotes. The FFTS tags the footnote ruler, for example, as <div data-ph5="footNoteRuler">. The FFTS allows the footnote ruler to be toggled on and off in the cut pages. For fixed page rendering, the FFTS uses filler compensation for eliminating orphans, widows, and divorce between couples, for example, section heading and paragraph, figure and table, table heading and table, etc. The FFTS represents the fillers, for example, as <span data-ph5="fillerText">. The FFTS automatically deploys fillers, for example, line spaces, if needed, to fill a page and improve aesthetics.
The file format transformation system (FFTS) renders 115 the grouped marked-up content with the inserted pagination elements in the reversible file format. The FFTS compiles and positions the reflown marked-up content and the pagination elements with associated properties at predetermined context based positions across multiple pages based on page dimensions and the appended tags. The FFTS performs hyphenation and justification in the rendered marked-up content in the reversible second file format to provide kerning based on aesthetics, for example, for avoidance of loose lines and blank rivers. The reversible second file format allows the marked-up content to be reversed to the first file format to restore the continuous page. The rendered marked-up content in the reversible second file format is accessible on one or more browsers on multiple operating systems. The fixed page in the reversible file format to which the marked-up content in the first file format is transformed is expressed, for example, as a pixel dimension equivalent of a paper size or a device size. The data model of the reversible file format, for example, referred to as the PH5 format transforms a fluid page, for example, in a hypertext markup language (HTML) format to a fixed page, for example, in the reversible file format or the PH5 format, where the transformation is reversible. That is, the FFTS interprets a fluid page and delivers a fixed page. The tagged input allows the transformation of a fluid page to a fixed page. The enriched inheritance comprises page breaks. The other elements are defined in terms of the page breaks. The extension of the fixed page in the PH5 format is, for example, .PH5. The FFTS bridges fluid web-content and fixed-page typesetting, originating as a fluid HTML, without a reference printer at the destination.
The PH5 format is similar, for example, to a zip file format such as an electronic publication (ePub) format and can be opened in a common browser on any operating system in a fixed page view. A PH5 file can be back-transformed into a standard hypertext markup language (HTML) file from which the PH5 file was generated with the fluidity of the HTML file restored.
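By way of illustration only, the back transformation that restores continuity can be sketched as a rewrite of the break spans back to plain word space spans, following the conversion of <span data-ph5="wsbr"> and <span data-ph5="wsp"> to the word space tag stated in the summary. The function name restoreContinuity and the assumption that the markup is available as a string with exactly this attribute spelling are made for this sketch only; the disclosed system may instead suppress the changes in the browser.

```javascript
// Turn line break and para break spans back into ordinary word space spans,
// so that the text flows as one continuous page again.
// Only <span> elements are rewritten; <div data-ph5="wsp"> is left as is.
function restoreContinuity(markup) {
  return markup.replace(
    /<span data-ph5="(?:wsbr|wsp)">/g,
    '<span data-ph5="ws">'
  );
}
```

Spans tagged "wshbr" are deliberately left untouched in this sketch, because reverting a hyphenated line end also involves the hyphen itself, which the description does not detail.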
In the PH5 data model, the file format transformation system (FFTS) performs document intelligence tagging. Tagging the spaces or blanks effects visible content for emulation and standardization. In the PH5 file format, line break candidates are identified and marked up as page breaks. With this method, implicit statements in the document are understood and tagged for downstream machine reading or paging. The transformation from a fluid file format to the reversible file format, that is, the PH5 file format is accomplished subject to the availability of a tag set that exposes an understanding of document semantics to scripts that generate the PH5 package. Creation of the tag set allows creation of a fixed page view that captures document elements that are expressed relationally and that are then assigned page-and-context-appropriate placement and styling. A PH5 file, as a portable document, anticipates the tag set in a work queue and defines a standard for creating the same. The PH5 files do not need reference printers, driver installations, configuration of printer settings, etc., and also do not need a reader application or a browser plug-in. Furthermore, the PH5 files allow offline storage of information.
FIG. 2 exemplarily illustrates an interpretation of marked-up content in a reversible second file format, herein referred to as a PH5 format. A typical HTML page does not have tags specified for spaces. The HTML page comprises a header, a footer, footnotes, floats such as figures, tables, images, video, audio, etc. The file format transformation system (FFTS) loads a hypertext markup language (HTML) page with associated cascading style sheets and transforms the HTML page to a PH5 page 201 of the PH5 format as exemplarily illustrated in FIG. 2.
During the transformation, the FFTS identifies word spaces and block elements in the HTML page and appends the identified word spaces and the identified block elements with appropriate PH5 format tags. For example, the FFTS tags each word space with a tag <span data-ph5="ws"> and appends a tag <div data-ph5="wsp"> at the end of every paragraph. The FFTS identifies line breaks using JavaScript® and tags the identified line breaks, for example, with a tag <span data-ph5="wsbr">. Further, the FFTS tags the line breaks that end with a hyphen, for example, with a tag <span data-ph5="wshbr">. The FFTS performs tagging without replacing the original HTML tags, thereby preserving the original HTML tags to allow the final output in the reversible file format to be reverted to the HTML page, if a user wants to suppress the changes and revert to the HTML page. The tagged HTML page, that is, the PH5 page 201 exemplarily illustrated in FIG. 2, contains all the content of the original input HTML page along with the PH5 format tags. The tagging process allows the FFTS to transform a fluid HTML page into fixed HTML pages. A fluid HTML page contains responsive content elements that resize their position and geometry according to a web browser width.
The file format transformation system (FFTS) further introduces a page break tag <div data-ph5="wspbr"> next to an appropriate line break with reference to dimensions of the page. The FFTS inserts a page number tag <div data-ph5="page-number"> at the bottom of the page. The FFTS positions any available footnotes proximate to a respective citation and once the page breaks are introduced, the FFTS tags the footnotes <div data-ph5="footnote"> and positions the footnotes at the bottom of the page. The FFTS assigns an additional class "firstFootnote" to the first footnote and an additional class
"notFirstFootnote" to the following footnotes to differentiate between the first footnote and the following footnotes. The FFTS introduces a horizontal line to separate the main content from the footnote matter and tags the horizontal line as <div data-ph5="footNoteRuler">. The FFTS tags floats, for example, "FIG 1" and "FIG 2" exemplarily illustrated in FIG. 2, as <div data-ph5="float"> with a relevant "id" attribute and tags corresponding anchors as <span data-ph5="float-anchor"> with a "refid" attribute matching the "id" attribute value of the corresponding float. The FFTS renders the PH5 page 201 with the PH5 tags disclosed above. The FFTS creates headers and footers using auto-generated content wrapped in <div data-ph5 = "page-header"> and <div data-ph5 = "page-footer"> respectively, with sub-elements for left, right, or center positioning. The file format transformation system (FFTS) performs PH5 tag recognition for automated browser based pagination and generates output pages 202. The FFTS recognizes the PH5 format tags appended in the PH5 tagged hypertext markup language (HTML) page 201. As exemplarily illustrated in FIG. 2, the PH5 tagged HTML page 201 comprises two figures labeled as "FIG 1" and "FIG 2" along with another float. During the tag recognition process, the FFTS encounters the float tag of the first float, that is, FIG 1, in the PH5 tagged HTML page 201 and positions "FIG 1" proximal to the corresponding citation until the FFTS encounters a page break tag. When a page break tag is encountered based on the availability of space, the FFTS positions the float "FIG 1" at the top or bottom of the page close to the respective citation. The FFTS then allows the reflow of the HTML content to fit in the specified page width. The FFTS, upon recognizing the footnote tag, introduces the footnote matter at the bottom of the page in close proximity to the respective citation.
The FFTS introduces a footnote ruler to separate the main content from the footnote matter upon recognition of the footnote tag. The FFTS further encounters a page number tag and introduces a page number at the bottom of the page after the footnote matter. The FFTS then breaks the page into an individual page after encountering the page break tag which is placed based on the reference page height.
The file format transformation system (FFTS) then proceeds to the next section after the page break tag, proxies "FIG 2" and other floats, for example, audio, video, tables, etc., to a linked source, positions these floats according to the availability of space, positions page breaks according to the pixel dimension of the page, and inserts a page number for the current page. The FFTS then proceeds to the next section after the page break tag, positions the remaining footnotes on the next page, and inserts a page number for the next page. The FFTS performs the page transformation process until the last page break tag is recognized.
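By way of illustration only, the cutting of the continuous page into pages of a reference height can be sketched as a single pass over the measured line heights. The function name paginate, the input (an array of line heights in pixels), and the output shape are assumptions made for this sketch; floats, footnotes, and widow and orphan control, which the disclosed method also takes into account, are omitted here.

```javascript
// lineHeights: heights in pixels of the lines of the continuous page, in order.
// pageHeight:  configurable reference page height in pixels.
// Returns one entry per page with its page number and the indices of its lines.
// A page break falls after the last line index of every page except the final one.
function paginate(lineHeights, pageHeight) {
  const pages = [{ pageNumber: 1, lines: [] }];
  let used = 0;
  lineHeights.forEach((height, index) => {
    const current = pages[pages.length - 1];
    // Start a new page when the line does not fit, unless the page is still empty
    // (an oversized line is placed alone rather than looping forever).
    if (used + height > pageHeight && current.lines.length > 0) {
      pages.push({ pageNumber: pages.length + 1, lines: [] });
      used = 0;
    }
    pages[pages.length - 1].lines.push(index);
    used += height;
  });
  return pages;
}
```

With four lines of 200, 200, 200, and 100 pixels and a page height of 500 pixels, the first two lines stay on page 1 and the remaining two move to page 2, so the page break is placed at the line break after the second line.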
FIGS. 3A-3F exemplarily illustrate a flowchart comprising the steps performed by the file format transformation system (FFTS) for transforming marked-up content in a first file format, for example, a hypertext markup language (HTML) format to a reversible second file format, hereafter referred to as a "reversible file format". As exemplarily illustrated in FIG. 3A, the FFTS loads 301 the HTML content with cascading style sheets (CSS) in a browser and examines 302 the loaded HTML content. The FFTS analyzes and describes syntactic roles of the HTML content. The FFTS introduces 303 hidden code points, for example, soft hyphens into the HTML content based on popular dictionary elements, for example, dictionary syllables. The FFTS then reflows 304 the HTML content to fit a desired page width with a running continuous page height. The reflow process is used in a markup language document to render the markup language document to different types of user devices. The FFTS identifies 305 spaces between words in the reflown HTML content and replaces 306 each of the spaces with a tag, for example, <span data-PH5 = WS> tag, where "WS" refers to word space. The FFTS performs word spacing according to a kerning of a selected font. The FFTS also identifies 307 block elements in the reflown HTML content and introduces 308 a tag, for example, a <div class WSP> tag for each of the identified block elements in the reflown HTML content, where "WSP" refers to word space paragraph. As used herein, the term "<div>" refers to a markup language tag that defines a container for holding content elements.
After tagging, the file format transformation system (FFTS) iteratively processes the generated tags and identifies, for each of the identified spaces and the identified block elements, one or more pagination elements in the reflown HTML content. In this example, the FFTS identifies pagination elements such as line breaks, floats, and footnotes as exemplarily illustrated in FIGS. 3C-3E respectively.
FIG. 3B exemplarily illustrates iteration steps performed by the file format transformation system. The file format transformation system (FFTS) iterates 309 the steps of determining and assigning line breaks for every occurrence of the <WS> tag and the <WSP> tag as exemplarily illustrated in FIG. 3C, until all the <WS> tags and the <WSP> tags are processed 310. The FFTS, after processing all the <WS> and <WSP> tags, iterates 311 all the line breaks and then proceeds to the steps exemplarily illustrated in FIGS. 3D-3E.
FIG. 3C exemplarily illustrates determination and assignment of line breaks at appropriate positions in the reversible file format. The FFTS determines and assigns line breaks upon encountering any one of the following conditions: If the word space <WS> equals zero, then the FFTS identifies 312 the word space as a line break; if a soft hyphen <SHY> is not equal to zero, then the FFTS identifies 313 the soft hyphen as a line break; and if the FFTS identifies a paragraph break, the FFTS forces 314 a line break. After assigning the line breaks, the FFTS iterates 311 all the line breaks as exemplarily illustrated in FIG. 3B. FIG. 3D exemplarily illustrates positioning of floats proximate to a first citation in the reflown HTML content. The file format transformation system (FFTS) identifies 315 where the floats are cited in the reflown HTML content and checks 316 whether a current page can accommodate one or more floats. If the current page cannot accommodate one or more floats, the FFTS positions 317 one or more floats into the next available page proximate to their respective citation. If the current page can accommodate one or more floats, the FFTS inserts 318 the floats on the current page. For example, for a page with 500 pixels of fixed height, the FFTS examines each line from top to bottom until the FFTS reaches the specified page height of 500 pixels, while keeping track of the pixels covered. If the FFTS encounters a float before the 500 pixel height, the FFTS analyzes the float pixel dimension and the pixels covered so far and determines the sum of the float pixel dimension and the pixels covered till the point where the float was cited.
If the sum exceeds the specified page height, for example, 500 pixels, the FFTS inserts the float on the next available page after the page break, in a way that the float follows the citation but does not precede the citation, and if the sum of the float pixel dimension and the pixels covered till the point where the float was cited is less than the specified page height, then the float is inserted on the same page proximate to its citation. After positioning of the floats proximate to a first citation in the reflown HTML content, the FFTS proceeds to the steps exemplarily illustrated in FIG. 3F. FIG. 3E exemplarily illustrates positioning of footnotes at relevant pages in the reversible file format. The file format transformation system (FFTS) identifies footnotes in the reflown hypertext markup language (HTML) content and checks 319 whether space is available for a footnote cited in the reflown HTML content. If space is not available in the current page, the FFTS positions 320 the footnote, that is, the citation point's sentence and matter, on the next page. If there is enough space available in the current page, the FFTS positions 321 the footnote matter on a page footnote section as the footnote is cited. For example, for a page with 500 pixels of fixed height, the FFTS examines each line from top to bottom until the FFTS reaches the specified page height of 500 pixels, keeping track of pixels covered. If the FFTS encounters a footnote citation before the 500 pixel height, the FFTS analyzes the corresponding footnote pixel dimensions and the pixels covered so far and determines the sum of the footnote pixel dimension and the pixels covered till the point where the footnote was cited.
If the sum exceeds the specified page height, for example, 500 pixels, the FFTS accommodates the footnote along with its citation on the next available page after the page break, and if the sum of the footnote pixel dimension and the pixels covered till the point where the footnote was cited is less than the specified page height, then the footnote is accommodated proximate to its citation in the same page at the bottom. After positioning the footnotes at relevant pages in the reversible file format, the FFTS proceeds to the steps exemplarily illustrated in FIG. 3F.
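By way of illustration only, the space check applied to floats and footnotes in the 500 pixel example above can be sketched as a single comparison. The function name placeOnPage and its return values are assumptions made for this sketch. The description covers the cases where the sum exceeds the page height and where it is less than the page height; treating an exact fit as fitting on the current page is a further assumption.

```javascript
// pixelsCovered: pixels used on the current page up to the citation point.
// itemHeight:    pixel height of the float or footnote being placed.
// pageHeight:    specified page height in pixels (500 in the example above).
// Returns "next" when the sum exceeds the page height, otherwise "current".
function placeOnPage(pixelsCovered, itemHeight, pageHeight = 500) {
  return pixelsCovered + itemHeight > pageHeight ? "next" : "current";
}
```

For instance, a float of 120 pixels cited after 350 pixels of content gives a sum of 470 pixels and stays on the current page, whereas the same float cited after 420 pixels gives 540 pixels and moves to the next available page, following its citation.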
FIG. 3F exemplarily illustrates the rendering of the hypertext markup language (HTML) content in the reversible file format. The file format transformation system (FFTS) compares 322 the HTML content with a specified page height and introduces page breaks appropriately into the HTML content. The page breaks break the HTML content into individual pages of a predefined length. The FFTS groups 323 the HTML content on each page from header to footer using a <div> element. The FFTS inserts 324 page numbers into the individual pages based on a predefined numbering style, a header and footer, and places a footnote ruler wherever necessary. The FFTS checks 325 whether all the line breaks are processed. If all the line breaks are not processed, the FFTS iterates 311 all the line breaks as exemplarily illustrated in FIGS. 3B-3C. The FFTS then delivers 326 the marked-up content in the reversible file format. The FFTS provides an option to revert 327 the changes made in the reversible file format to the first file format, for example, the HTML file format. If a user wants to revert from the reversible file format to the HTML file format, the FFTS suppresses 328 the changes by hiding the changes in a background and displays the input HTML page having the input HTML content. FIGS. 4A-4B exemplarily illustrate screenshots showing edit views of marked-up content. FIG. 4A exemplarily illustrates a screenshot of an input hypertext markup language (HTML) page containing marked-up content without an edit window 402 in a right pane of a graphical user interface (GUI) 401. FIG. 4B exemplarily illustrates a screenshot of the input HTML page containing marked-up content showing the edit window 402 in the right pane of the GUI 401. The source code of the input HTML page in the edit view is provided below:
<div class="ce_section" id="sec0005" name="OPT_ID_294"><div class="sectionline_opt" name="PC_5897104232" id="PC_5897104232"><span class="ce_label"
name="OPT_ID_295">1</span><span class="x">&nbsp;</span><span class="ce_section-title" name="OPT_ID_296">Introducti<span data-request-id="1"
class="cursor"></span>on</span></div><div class="ce_para" id="par0020"
name="OPT_ID_297">Neurotrophins are a family of growth factors that regulate neuronal survival, growth, and differentiation in the central nervous system <a title="bib0005" class="ce_cross-ref" name="OPT_ID_298" id="OPT_ID_298">[1]</a>. Four identified neurotrophic factors including nerve growth factor, brain-derived neurotrophic factor (BDNF), neurotrophin-3, and neurotrophin-4 exert their effects through binding to two different receptors, the tropomyosin-related kinase (Trk) receptor and the p75 neurotrophin receptor. All
proneurotrophins are capable of binding to the p75 receptor; however, three Trk receptors, TrkA, TrkB, and TrkC, bind only to mature NGF, BDNF or NT-4, and NT-3, respectively <a title="bib0010" class="ce_cross-ref" name="OPT_ID_299"
id="OPT_ID_299">[2]</a>.</div><div class="ce_para" id="par0025"
name="OPT_ID_300">BDNF is the most abundant neurotrophin in the brain and is essential for synaptic plasticity involved in long-term potentiation (LTP) and learning and memory formation <a title="bib0015" class="ce_cross-ref" name="OPT_ID_301" id="OPT_ID_301">[3]</a>. Hippocampus-specific BDNF gene knockout or knockdown in rodents results in cognitive impairment in behavioral tests <a title="bib0020" class="ce_cross-refs" name="OPT_ID_302" id="OPT_ID_302">[4,5]</a>. Moreover, BDNF has neuroprotective effects against diverse neurotoxic insults and neurodegenerative disease models, including Alzheimer<span
class="pc_cpereplace" name="cpe_id_92"><span class="cpedel" name="cpe_id_92"><span name="OPT_ID_303"></span>'</span><span class="cpeins" name="cpe_id_92"><span name="OPT_ID_304"></span><span class="unicode-char">'</span></span></span>s disease (AD) <a title="bib0030" class="ce_cross-refs" name="OPT_ID_305"
id="OPT_ID_305">[6,7]</a>.</div><div class="ce_para" id="par0030"
name="OPT_ID_306">AD is a common neurodegenerative disease characterized by progressive cognitive deficits, and the accumulation of aggregated amyloid-beta (A<span class="unicode-char">β</span>) peptide and intracellular neurofibrillary tangles which are composed of hyperphosphorylated tau protein <a title="bib0040" class="ce_cross-ref" name="OPT_ID_307" id="OPT_ID_307">[8]</a>. A<span class="unicode-char">β</span> peptide, a key mediator of AD pathology, is produced after sequential cleavage of the amyloid precursor protein by beta- and gamma-secretases and subsequent aggregation into amyloid fibrils, known to be a major component of senile plaques <a title="bib0045" class="ce_cross-ref" name="OPT_ID_308" id="OPT_ID_308">[9]</a>. Aggregated A<span class="unicode-char">β</span> peptide including both oligomeric and fibrillar species induces neuronal cell death <span
class="ce_italic" name="OPT_ID_309">in vitro</span> and <span class="ce_italic"
name="OPT_ID_310">in vivo</span> <a title="bib0050" class="ce_cross-ref"
name="OPT_ID_311" id="OPT_ID_311">[10]</a>.</div><div class="ce_para" id="par0035" name="OPT_ID_312">Neuronal functions and their involvement in AD have drawn considerable attention to BDNF as a therapeutic target for AD treatment. However, recombinant BDNF itself has poor pharmacokinetic properties<span class="cpeins"
name="cpe_id_94"><span name="OPT_ID_313"></span>,</span> such as a short <span class="ce_italic" name="OPT_ID_314">in vivo</span> half-life, low blood<span
class="pc_cpereplace" name="cpe_id_95"><span class="cpedel" name="cpe_id_95"><span name="OPT_ID_315"></span>-</span><span class="cpeins" name="cpe_id_95"><span name="OPT_ID_316"></span><span class="unicode-char">-</span></span></span>brain barrier penetrability, and limited diffusion <a title="bib0055" class="ce_cross-ref"
name="OPT_ID_317" id="OPT_ID_317">[11]</a>. Thus, a variety of strategies for restoring endogenous BDNF levels and functions are currently under development, such as BDNF gene therapy, BDNF-releasing cell grafts, BDNF mimetics, and the use of small molecules that regulate endogenous BDNF levels <a title="bib0060" class="ce_cross-ref"
name="OPT_ID_318" id="OPT_ID_318">[12]</a>. Recently, Nagahara et al. have shown that BDNF gene delivery or direct BDNF infusion restores spatial learning and memory deficits in an AD mouse model and in aged rats, respectively <a title="bib0030" class="ce_cross-ref" name="OPT_ID_319" id="OPT_ID_319">[6]</a>.</div><div class="ce_para" id="par0040" name="OPT_ID_320">In a previous study, we screened and identified a BDNF-modulating peptide (Neuropep-1, Met<span class="pc_cpereplace" name="cpe_id_97"><span
class="cpedel" name="cpe_id_97"><span name="OPT_ID_321"></span>-Val-</span><span class="cpeins" name="cpe_id_97"><span name="OPT_ID_322"></span><span class="unicode-char">-</span>Val<span class="unicode-char">-</span></span></span>Gly) by a positional scanning<span class="pc_cpereplace" name="cpe_id_99"><span class="cpedel"
name="cpe_id_99"><span name="OPT_ID_323"></span>-</span><span class="cpeins" name="cpe_id_99"><span name="OPT_ID_324"></span><span class="unicode-char">-</span></span></span>synthetic peptide combinatorial library (PS<span class="pc_cpereplace" name="cpe_id_101"><span class="cpedel" name="cpe_id_101"><span
name="OPT_ID_325"></span>-</span><span class="cpeins" name="cpe_id_101"><span name="OPT_ID_326"></span><span class="unicode-char">-</span></span></span>SPCL). This novel peptide was found to protect neurons against A<span class="unicode-char">β</span>-induced neuronal cell death and improves spatial learning and memory performance in na<span class="unicode-char">ï</span>ve rats and a triple-transgenic AD mouse model <a title="bib0065" class="ce_cross-refs" name="OPT_ID_327"
id="OPT_ID_327">[13,14]</a>. In this study, we modified and synthesized novel peptides based on our previous PS<span class="pc_cpereplace" name="cpe_id_103"><span class="cpedel" name="cpe_id_103"><span name="OPT_ID_328"></span>-</span><span class="cpeins" name="cpe_id_103"><span name="OPT_ID_329"></span><span
class="unicode-char">-</span></span></span>SPCL data to identify a BDNF-modulating peptide more potent than Neuropep-1, and examined its protective effects against A<span class="unicode-char">β</span>-induced neuronal cell death. Among the identified BDNF-modulating peptides, Neuropep-4, which has aspartic acid substituted for valine at the second position of Neuropep-1, was found to be highly effective in inducing BDNF expression even at 100-fold lower concentrations in the SH<span class="pc_cpereplace"
name="cpe_id_105"><span class="cpedel" name="cpe_id_105"><span
name="OPT_ID_330"></span>-</span><span class="cpeins" name="cpe_id_105"><span name="OPT_ID_331"></span><span class="unicode-char">-</span></span></span>SY5Y cell line. In addition, Neuropep-4 regulated BDNF expression in rat primary cortical neurons and provided neurons with the strongest protection against oligomeric and fibrillar A<span class="unicode-char">β</span><span class="ce_inf" name="OPT_ID_332">1-42</span>-induced cell death through BDNF upregulation compared to other peptides. These findings suggest that this novel peptide, Neuropep-4, has therapeutic potential for the treatment of AD.</div></div>
FIG. 4C exemplarily illustrates a screenshot of the input hypertext markup language (HTML) page containing marked-up content in a proof view. The source code of the input HTML page in the proof view is provided below:
<div class="wrapper-page clearfix"><div class="page clearfix" style="height: 990px; position: relative;"><div class="page-header"><div class="left-header journal-logo"><img src="http://s3.amazonaws.com/pgc-dev-test/cover_images/elsevier/HLY/HLY_Thumbnail.png"></div><div class="right-header"><span class="article-no">Article&nbsp;No&nbsp;~&nbsp;</span></div></div><div class="content top-space-none"><div class="head top-space-none" id="head1" name="OPT_ID_161"><div class="ce_abstract top-space-none" id="abs0010" name="OPT_ID_256"><div
class="ce_abstract-sec top-space-none" id="abst0010" name="OPT_ID_258"><span
class="ce_simple-para top-space-none" id="spar0025" name="OPT_ID_259">va-line<span class="ws"> </span>in<span class="ws"> </span>the<span class="ws"> </span>sec-ond<span class="ws"> </span>po-si-tion<span class="ws"> </span>with<span class="ws">
</span>as-par-tic<span class="ws"> </span>acid,<span class="ws"> </span>the<span class- 'ws"> </span>re-sult-ing<span class- 'ws"> </span>Neu-ropep-4<span class="ws"> </span>was<span class="wsbr"> </span>found<span class="ws"> </span>to<span class="ws"> </span>be<span class="ws"> </span>highly<span class="ws"> </span>ef-fec-tive<span class="ws"> </span>in<span class- 'ws"> </span>in-duc-ing<span class="ws">
</span>BDNF<span class- 'ws"> </span>ex-pres-sion<span class="ws"> </span>even<span class="ws"> </span>at<span class="ws"> </span>con-cen-tra-tions<span class="wsbr"> </span>of<span class- ' ws"> </span>l<span class="ce_hsp"
name="OPT_ID_275">&nbsp;</span>pM<span class="ws"> </span>in<span class="ws"> </span>the<span class="ws"> </span>SH<span class="pc_cpereplace"
name="cpe_id_89"><span class="cpedel hideme" name="cpe_id_89"><span
name- OPT_ID_276"></span>-</span><span class- 'cpeins" name="cpe_id_89"><span name="OPT_ID_277"x/span><span class="unicode-char">- </span></span></span>SY5Y<span class="ws"> </span>cell<span class="ws">
</span>line<span class="ws"> </span>and<span class="ws"> </span>rat<span class="wsM> </span>pri-mary<span class="ws"> </span>cor-ti-cal<span class="ws"> </span>neu-rons.<span class="ws"> </span>In<span class="ws"> </span>ad-di-tion,<span class="wsbr">
</span>among<span class="ws"> </span>the<span class="ws"> </span>tested<span
class="ws"> </span>pep-tides,<span class="ws"> </span>Neu-ropep-4<span class="ws"> </span>pro-vided<span class="ws"> </span>neu-rons<span class="ws"> </span>with<span class="ws"> </span>the<span class="ws"> </span>strongest<span class="wshbr">
</span>pro-tec-tion<span class="ws"> </span>against<span class="ws">
</span>oligomeric<span class="ws"> </span>and/or<span class="ws"> </span>fib-ril-lar<span class="ws"> </span>A<span class="unicode-char">P</span><span class="ce_inf '
name=OPT_ID_278">l-42</span>-in-duced<span class="ws"> </span>celKspan class="ws"> </span>death<span class="ws"> </span>through<span class="ws"> </span>BDNF<span class="wsbr"> </span>up-reg-u-la-tion.<span class- 'ws"> </span>These<span class="ws"> </span>re-sults<span class="ws"> </span>sug-gest<span class="ws"> </span>the<span class="ws"> </span>po-ten-tial<span class="ws"> </span>of<span class="ws">
</span>Neu-ropep-4<span class="ws"> </span>as<span class="ws"> </span>a<span
class="ws"> </span>ther-a-peu-tic<span class="wsbr"> </span>can-di-date<span class="ws"> </span>for<span class="ws"> </span>treat-ing<span class="ws">
</span>neu-rode-gen-er-a-tive<span class="ws"> </span>dis-eases<span class="cpeins" name="cpe_id_91 "><span name="OPT_ID_279"></span>,</span><span class="ws">
</span>such<span class="ws"> </span>as<span class="ws last-word"> </span>AD.</span><div class="wsp"x/div></div></div><div class="ce_keywords" name="OPT_ID_280" id="OPT_ID_280"><span class="ce_section-title" name="OPT_ID_28 l "> eywords<span class- 'x">: </span></span><span class="ce_keyword" name="OPT_ID_282"><span class="ce_text" name="OPT_ID_283 ">Brain-de-rived<span class="ws">
</span>neu-rotrophic<span class="ws"> </span>fac-tor</span><span class="x">;
</span></span><span class="ce_keyword" name- 'OPT_ID_284"><span class="ce_text" name="OPT_ID_285">Alzheimer<span class="unicode-char">'</span>s<span class="ws last- word'^ </span>disease</span><span class="x">; </span></span><span class- 'ce keyword" name=OPT_ID_286"><span class="ce_text" name="OPT_ID_287">Amy-loid- beta</span><span class="x">; </span></span><span class="ce_keyword"
name="OPT_ID_288"><span class="ce_text"
name="OPT_ID_289">Neu-ro-pro-tec-tion</span><span class="x">; </span></span><span class="ce_keyword" name="OPT_ID_290"><span class="ce_text"
name="OPT_ID_291 ">Pep-tide</span></span><div class="wsp"></div></div></div><div class="ce_sections" name='OPT_ID_293" id="OPT_ID_293"><div class="ce_section" id="sec0005" name- OPT_ID_294"><div class="sectionline_opt first_level_heading" name="PC_5897104232" id="PC_5897104232"><span class="ce_label"
name="OPT_ID_295 "> 1 </span><span class="x">&nbsp;</span><span class="ce_section-title" name=OPT_ID_296">Introduction</span></div><div class="ce_para" id="par0020" name="OPT_ID_297">Neu-rotrophins<span class="ws"> </span>are<span class="ws">
</span>a<span class="ws"> </span>fam-ily<span class="ws"> </span>of<span class="ws"> </span>growth<span class="ws"> </span>fac-tors<span class="ws"> </span>that<span class="ws"> </span>reg-u-late<span class="ws"> </span>neu-ronal<span class="ws">
</span>sur-vival,<span class="wsbr"> </span>growth,<span class="ws"> </span>and<span class="ws"> </span>dif-fer-en-ti-a-tion<span class="ws"> </span>in<span class="ws">
</span>the<span class="ws"> </span>cen-tral<span class="ws"> </span>ner-vous<span class="ws"> </span>sys-tem<span class="ws"> </span><a title="bib0005" class="ce_cross-ref ' name="OPT_ID_298" id="OPT_ID_298">[l]</a>.<span class="ws"> </span>Four<span class="ws"> </span>iden-ti-fied<span class="wshbr"> </span>neu-rotrophic<span class="ws"> </span>fac-tors<span class="ws"> </span>in-clud-ing<span class="ws"> </span>nerve<span class="ws"> </span>growth<span class="ws"> </span>fac-tor,<span class="ws"> </span>brain- de-rived<span class="ws"> </span>neu-rotrophic<span class="ws"> </span>fac-tor<span class="wsbr"> </span>(BDNF),<span class="ws"> </span>neu-rotrophin-3,<span class="ws"> </span>and<span class="ws"> </span>neu-rotrophin-4<span class="ws"> </span>ex-ert<span class="ws"> </span>their<span class="ws"> </span>ef-fects<span class- 'ws">
</span>through<span class="ws"> </span>bind-ing<span class="ws"> </span>to<span class="wsbr"> </span>two<span class="ws"> </span>dif-fer-ent<span class="ws">
</span>re-cep-tors,<span class- 'ws"> </span>the<span class="ws"> </span>tropomyosin- re-lated<span class="ws"> </span>ki-nase<span class="ws"> </span>(Trk)<span class="ws"> </span>re-cep-tor<span class="ws"> </span>and<span class="ws"> </span>the<span class="ws"> </span>p75<span class="wsbr"> </span>neu-rotrophin<span class="ws">
</span>re-cep-tor.<span class="ws"> </span>All<span class="ws">
</span>proneu-rotrophins<span class="ws"> </span>are<span class="ws">
</span>ca-pa-ble<span class="ws"> </span>of<span class="ws"> </span>bind-ing<span class="ws"> </span>to<span class="ws"> </span>the<span class="ws"> </span>p75<span class- 'wshbr"> </span>re-cep-tor;<span class="ws"> </span>how-ever,<span class="ws"> </span>three<span class="ws"> </span>Trk<span class="ws"> </span>re-cep-tors,<span class="ws"> </span>TrkA,<span class="ws"> </span>TrkB,<span class="ws">
</span>and<span class="ws"> </span>TrkC,<span class="ws"> </span>bind<span class="ws"> </span>only<span class="ws"> </span>to<span class="ws"> </span>ma-ture<span
class="wsbr"> </span>NGF,<span class="ws"> </span>BDNF<span class="ws">
</span>or<span class="ws"> </span>NT-4,<span class="ws"> </span>and<span class="ws"> </span>NT-3,<span class- 'ws"> </span>re-spec-tively<span class="ws last-word"> </span><a title="bib0010" class="ce_cross-ref name="OPT_ID_299" id="OPT_ID_299">[2]</a>.<div class="wsp"x/div></div><div class="ce_para" id="par0025"
name="OPT_ID_300">BDNF<span class="ws"> </span>is<span class="ws"> </span>the<span class="ws"> </span>most<span class="ws"> </span>abun-dant<span class="ws">
</span>neu-rotrophin<span class- 'ws"> </span>in<span class="ws"> </span>the<span class="ws"> </span>brain<span class="ws"> </span>and<span class- 'ws"> </span>is<span class="ws"> </span>es-sen-tial<span class="ws"> </span>for<span class="ws">
</span>synap-tic<span class- 'wsbr"> </span>plas-tic-ity<span class="ws">
</span>in-volved<span class="ws"> </span>in<span class="ws"> </span>long-term<span class="ws"> </span>po-ten-ti-a-tion<span class="ws"> </span>(LTP)<span class="ws"> </span>and<span class- 'ws"> </span>learn-ing<span class="ws"> </span>and<span class="ws"> </span>mem-ory<span class="wshbr"> </span>for-ma-tion<span class="ws"> </span><a title="bib0015" class="ce_cross-ref ' name="OPT_ID_301 "
id="OPT_ID_30r'>[3]</a>.<span class="ws"> </span>Hip-pocam-pus-spe-cific<span class="ws"> </span>BDNF<span class="ws"> </span>gene<span class="ws"> </span>knock-out<span class="ws"> </span>or<span class="ws"> </span>knock-down<span class="ws"> </span>in<span class="ws"> </span>ro-dents<span class-"wsbr">
</span>re-sults<span class="ws"> </span>in<span class="ws"> </span>cog-ni-tive<span class="ws"> </span>im-pair-ment<span class="ws"> </span>in<span class="ws">
</span>be-hav-ioral<span class="ws"> </span>tests<span class="ws"> </span><a
title="bib0020" class="ce_cross-refs" name="OPT_ID_302"
id="OPT_ID_302">[4,5]</a>.<span class="ws"> </span>More-over,<span class="ws">
</span>BDNF<span class="ws"> </span>has<span class="wsbr">
</span>neu-ro-pro-tec-tive<span class="ws"> </span>ef-fects<span class="ws">
</span>against<span class="ws"> </span>di-verse<span class="ws"> </span>neu-ro-toxic<span class="ws"> </span>in-sults<span class="ws"> </span>and<span class="ws">
</span>neu-rode-gen-er-a-tive<span class="wsbr"> </span>dis-ease<span class="ws">
</span>mod-els,<span class="ws"> </span>in-clud-ing<span class="ws">
</span>Alzheimer<span class="pc_cpereplace" name- 'cpe_id_92"><span class="cpedel hideme" name="cpe_id_92"><span name- 'OPT ID 303 "></span>'</span><span
class- 'cpeins" name="cpe_id_92"><span name="OPT_ID_304"></span><span class="unicode- char">'</span></span></span>s<span class="ws"> </span>dis-ease<span class="ws">
</span>(AD)<span class="ws last-word"> </span><a title="bib0030" class="ce_cross-refs" name="OPT_ID_305" id="OPT_ID_305">[6,7]</a>.<div class-"wsp"></div></div><div class="ce_para" id="par0030" name="OPTJD_306">AD<span class="ws"> </span>is<span class="ws"> </span>a<span class="ws"> </span>com-mon<span class="ws">
</span>neu-rode-gen-er-a-tive<span class="ws"> </span>dis-ease<span class="ws">
</span>char-ac-ter-ized<span class="ws"> </span>by<span class="ws">
</span>pro-gres-sive<span class="wshbr"> </span>cog-ni-tive<span class="ws">
</span>deficits,<span class="ws"> </span>and<span class- 'ws"> </span>the<span
class="ws"> </span>ac-cu-n u-la-tion<span class="ws"> </span>of<span class="ws">
</span>ag-gre-gated<span class="ws"> </span>amy-loid-beta<span class=" ws">
</span>(A<span class="unicode-char">P</span>)<span class="ws"> </span>pep-tide<span class="ws"> </span>and<span class=:"wsbr"> </span>in-tra-cel-lu-lar<span class="ws">
</span>neu-rofib-ril-lary<span class="ws"> </span>tan-gles<span class="ws">
</span>which<span class="ws"> </span>are<span class^'^vs'^ </span>com-posed<span class="ws"> </span>of<span class="ws"> </span>hy-per-phos-pho-ry-lated<span
class="wsbr"> </span>tau<span class="ws"> </span>pro-tein<span class="ws"> </span><a title="bib0040" class="ce_cross-ref ' name="OPT_ID_307" id="OPTJD_307M>[8]</a>.<span class="ws"> </span>A<span class="unicode-char">p</span><span class="ws"> </span>pep-tide,<span class="ws"> </span>a<span class="ws"> </span>key<span class="ws"> </span>me-di-a-tor<span class="ws"> </span>of<span class="ws"> </span>AD<span class="ws"> </span>pathol-ogy,<span class="ws">. </span>is<span class="ws">
</span>pro-duced<span class="ws"> </span>af-ter<span class="wshbr">
</span>se-quen-tial<span class="ws"> </span>cleav-age<span class="ws"> </span>of<span class="ws"> </span>the<span class="ws"> </span>amy-loid<span class="ws">
</span>pre-cur-sor<span class="ws"> </span>pro-tein<span class="ws"> </span>by<span class="ws"> </span>beta-<span class="ws"> </span>and<span class="ws"> </span>gamma- sec-re-tases<span class="wsbr"> </span>and<span class="ws"> </span>sub-se-quent<span class="ws"> </span>ag-gre-ga-tion<span class="ws"> </span>into<span class="ws">
</span>amy-loid<span class="ws"> </span>fib-rils,<span class="ws"> </span>known<span class="ws"> </span>to<span class="ws"> </span>be<span class="ws"> </span>a<span class="ws"> </span>ma-jor<span class="ws"> </span>com-po-nent<span class="wsbr"> </span>of<span class- 'ws"> </span>se-nile<span class- ' ws"> </span>plaques<span class="ws"> </span><a title="bib0045" class="ce_cross-ref * name="OPT_ID_308"
id="OPT_ID_308">[9]</a>.<span class="ws"> </span>Ag-gre-gated<span class="ws"> </span>A<span class="unicode-char"> </span><span class="ws"> </span>pep-tide<span class="ws"> </span>in-clud-ing<span class="ws"> </span>both<span class="ws">
</span>oligomeric<span class="ws"> </span>and<span class="wshbr"> </span>fib-ril-lar<span class="ws"> </span>species<span class="ws"> </span>in-duces<span class="ws">
</span>neu-ronal<span class="ws"> </span>cell<span class="ws"> </span>death<span class="ws"> </span><span class="ce_italic" name="OPT_ID_309">in<span class="ws"> </span>vitro</span><span class="ws"> </span>and<span class="ws"> </span><span class="ce_italic" name="OPT_ID_310">in<span class="ws"> </span>vivo</span><span class="ws last-word"> </span><a title="bib0050" class="ce_cross-ref name="OPT_ID_31 1 " id="OPT_ID_31 l ">[10]</a>.<div class="wsp"></div></div><div class="ce_para"
id="par0035" name- OPT_ID_312M>Neu-ronal<span class="ws"> </span>func-tions<span class="ws"> </span>and<span class="ws"> </span>their<span class="ws">
</span>in-volve-ment<span class="ws"> </span>in<span class="ws"> </span>AD<span class="ws"> </span>have<span class="ws"> </span>drawn<span class- 'ws">
</span>con-sid-er-able<span class="wshbr"> </span>at-ten-tion<span class="ws">
</span>to<span class="ws"> </span>BDNF<span class="ws"> </span>as<span class="ws"> </span>a<span class="ws"> </span>ther-a-peu-tic<span class- 'ws"> </span>tar-get<span class="ws"> </span>for<span class="ws"> </span>AD<span class="ws">
</span>treat-ment.<span class="ws"> </span>How-ever,<span class="ws">
</span>re-com-bi-nant<span class="wsbr"> </span>BDNF<span class="ws">
</span>it-self<span class="ws"> </span>has<span class="ws"> </span>poor<span class="ws"> </span>phar-ma-co-ki-netic<span class="ws"> </span>prop-er-ties<span class="cpeins" name- 'cpe_id_94"xspan name=OPT_ID_313 "></span>,</span><span class="ws">
</span>such<span class- 'ws"> </span>as<span class- 'ws"> </span>a<span class="ws"> </span>short<span class="ws"> </span><span class="ce italic" name="OPT_ID_314">in<span class="ws"> </span>vivo</span><span class="ws"> </span>half-life,<span class="wsbr"> </span>low<span class="ws"> </span>blood<span class="pc_cpereplace"
name="cpe_id_95"><span class- 'cpedel hideme" name="cpe_id_95"><span
name="OPT_ID_315"></span>-</span><span class="cpeins" name="cpe_id_95"><span name- ΡΤ_ΓΕ)_316"x/spanxspan class="unicode-char">- </span></span></span>brain<span class="ws"> </span>bar-rier<span class="ws">
</span>pen-e-tra-bil-ity,<span class="ws"> </span>and<span class="ws">
</span>lim-ited<span class- 'ws"> </span>dif-fu-sion<span class="ws"> </span><a
title="bib0055" class=.Mce_cross-ref' name="OPT_ID_317" id="OPT_ID_317">[l l]</a>.<span class="ws"> </span>Thus,<span class="ws"> </span>a<span class="ws">
</span>va-ri-ety<span class="ws"> </span>of<span
class="fillerText"></span></div></div></div></div><div class="footer-wrapper bottom-footer-wrapper"><div class="footnote-wrapper"></div><div class="top-ruler"></div><div
class="pane-content"><div class="left-pane-content page-number">2</div><div class="center-pane-content"><span class="doi">http://dx.doi.org/10.1016/j.neulet.2013.11.020</span><span class="copyright">1060-3743/©2013 The Authors. Published by Elsevier
Limited.</span><span class="cc-license">This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).</span></div><div class="right-pane-content"></div></div></div></div></div>
The file format transformation system (FFTS) transforms the input hypertext markup language (HTML) page exemplarily illustrated in FIGS. 4A-4B, to an output page in the reversible file format, that is, the PH5 format as exemplarily illustrated in FIG. 4C. In the output page in the PH5 format, the FFTS replaces each of the word spaces identified in the marked-up content of the input hypertext markup language (HTML) page exemplarily illustrated in FIG. 4A, with a tag <span class="ws">, and tags each of the line breaks with a tag <span class="wsbr">. The FFTS further hyphenates words where appropriate. The FFTS also introduces a tag <span class="fillerText"> to fill in orphan and widow sections with filler text. The FFTS retains the original HTML tags and appends the PH5 format tags to the marked-up content. FIG. 4D exemplarily illustrates a screenshot showing a source code of the marked-up content rendered in the reversible file format, that is, the PH5 format.
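The tagging of word spaces and its reversal can be sketched on a plain text string. This is a minimal illustration, not the FFTS implementation: the names `tagWordSpaces` and `untagWordSpaces` are hypothetical, and the sketch assumes the input carries no markup of its own, whereas the FFTS operates on the text nodes of an HTML page.

```javascript
// Hypothetical helpers; the names are not taken from the FFTS.
// Assumes plain text input that contains no markup of its own.
var WS_TAG = '<span class="ws"> </span>';

// Replace every single space with a tagged span so that each word
// space can later be inspected individually.
function tagWordSpaces(text) {
    return text.split(' ').join(WS_TAG);
}

// Reverse the tagging and restore the original spaces.
function untagWordSpaces(tagged) {
    return tagged.split(WS_TAG).join(' ');
}

var original = 'BDNF is the most abundant neurotrophin';
var tagged = tagWordSpaces(original);
var restored = untagWordSpaces(tagged);
console.log(tagged);
console.log(restored === original);
```

Because the original characters are kept inside the added tags, removing the tags recovers the input text, which is the property that makes the format reversible in this simplified setting.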
FIG. 5 exemplarily illustrates a system 500 comprising a file format transformation system (FFTS) 502 deployed on a client device 501 for transforming marked-up content in a first file format to a reversible second file format. The client device 501 can be, for example, a personal computer, a tablet computing device, a mobile computer, a mobile phone, a smart phone, a portable computing device, a laptop, a personal digital assistant, a touch centric device, a workstation, a portable electronic device, a network enabled computing device, an interactive network enabled communication device, any other suitable computing equipment, combinations of multiple pieces of computing equipment, etc. In an embodiment, the FFTS 502 is implemented as a standalone software application on the client device 501.
The system 500 disclosed herein comprises a non-transitory computer readable storage medium such as a memory unit, and at least one processor communicatively coupled to the non-transitory computer readable storage medium on the client device 501. As used herein, "non-transitory computer readable storage medium" refers to all computer readable media, for example, non-volatile media such as optical discs or magnetic disks, volatile media such as a register memory, a processor cache, etc., and transmission media such as wires that constitute a system bus coupled to the processor, except for a transitory, propagating signal. The non-transitory computer readable storage medium stores computer program instructions defined by modules, for example, 502a, 502b, 502c, 502d, 502e, 502f, etc., of the FFTS 502. The processor is configured to execute the defined computer program instructions.
The file format transformation system (FFTS) 502 further comprises a content reception module 502a, a content reflow module 502b, a space and block identification module 502c, a tagging module 502d, a pagination element processing module 502e, and a compiler 502f. The content reception module 502a receives the marked-up content of the first file format, for example, the hypertext markup language (HTML) format. An example of a pseudocode of the content reception module 502a executed to receive the marked-up content of the first file format is provided below:
function receiveContent(self, container, source) {
    var innerContainer = null, paginator = null;
    var content = null;
    // Create the outer container, then add the paginator element to it.
    generateContentContainer(self, container);
    paginator = self.domHelper.create('div');
    paginator.classList.add('paginator');
    self.domHelper.append(paginator, container);
    // Insert soft hyphens so that words can later break across lines.
    content = source;
    content = insertSoftHyphensForAllWords(content);
    // Load the hyphenated content into the paginator element.
    innerContainer = self.domHelper.find(container, '.paginator');
    innerContainer.innerHTML = content;
}
The content reflow module 502b reflows the received marked-up content of the first file format into a continuous page having a configurable page width. An example of a pseudocode of the content reflow module 502b executed to reflow the received marked-up hypertext markup language (HTML) content is provided below:
var options = {
    "options": {
        "page": {
            "height": "262",
            "width": "192",
            "unit": "mm"
        }
    }
};
// self is assumed to carry the options above and a domHelper;
// target is assumed to be the element holding the content.
function reflowContent(self, target) {
    // Build a CSS declaration such as "width:192mm" from the page options.
    var width = [
        "width:",
        self.options.page.width,
        self.options.page.unit
    ].join("");
    self.domHelper.addOrModifyAttribute('style', width, target);
}
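Browser measurements such as element widths and offsets are reported in CSS pixels, so a page size configured in millimetres would typically need to be converted before it can be compared with them. The sketch below assumes the CSS reference conversion of 96 pixels per inch (25.4 mm); `mmToCssPx` and `pageSizeInPx` are hypothetical names and are not part of the FFTS.

```javascript
// Hypothetical helpers. CSS defines 1in = 96px and 1in = 25.4mm.
function mmToCssPx(mm) {
    return (mm * 96) / 25.4;
}

// Convert page options of the form used above to whole CSS pixels.
function pageSizeInPx(page) {
    if (page.unit !== 'mm') {
        throw new Error('this sketch only handles millimetres');
    }
    return {
        width: Math.round(mmToCssPx(Number(page.width))),
        height: Math.round(mmToCssPx(Number(page.height)))
    };
}

var size = pageSizeInPx({ height: '262', width: '192', unit: 'mm' });
console.log(size.width + ' x ' + size.height + ' px');
```

With these options the height rounds to 990 pixels, which agrees with the `height: 990px` style on the page container in the proof-view source.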
The space and block identification module 502c identifies spaces and block elements in the reflown marked-up content of the first file format. An example of a pseudocode of the space and block identification module 502c executed to identify and tag spaces and block elements in the reflown marked-up hypertext markup language (HTML) content is provided below:
function putSpanForWordSpace(self, content) {
    var ws = self.ws;
    // Select the text nodes (nodeType 3) that contain at least one space.
    content.find('*:visible').contents().filter(function () {
        var value = "";
        if (this.nodeType === 3) {
            value = this.nodeValue;
            if (value.indexOf(" ") != -1) {
                return true;
            }
        }
        return false;
    })
    .replaceWith(function () {
        var str = "", spaces = [], replacedStr = "", dummy = null,
            finalstr = "";
        str = jQ(this).text();
        // Escape the text through a detached element before splitting it.
        dummy = jQ('<div></div>');
        finalstr = dummy.text(str).html();
        spaces = finalstr.split(' ');
        replacedStr = spaces.join("<span data-ph5='ws'> </span>");
        return replacedStr;
    });
}
function identifyBlockElements(content) {
    var visibleDivs = content.find('div:visible'),
        length = visibleDivs.length,
        i = 0,
        visibleDiv = null;
    for (; i < length; i += 1) {
        visibleDiv = jQ(visibleDivs[i]);
        // Mark the end of every div that is not displayed inline.
        if (visibleDiv.css('display') !== "inline") {
            visibleDiv.append("<div data-ph5='wsp'></div>");
        }
    }
}
The tagging module 502d generates and appends tags to the identified spaces and the identified block elements in the reflown marked-up content of the first file format. For each of the identified spaces and the identified block elements, the pagination element processing module 502e determines line breaks in the reflown marked-up content of the first file format based on preconfigured criteria associated with the appended tags. The tagging module 502d tags the determined line breaks. An example of the pseudocode of the pagination element processing module 502e executed to determine the line breaks is provided below:
function determineLineBreaks() {
    var paginationElements = content.find("span.ws,span.shy,div.wsp"),
        length = paginationElements.length,
        i = 0, linebreak = false, curElement = null;
    for (; i < length; i += 1) {
        linebreak = false;
        curElement = jQ(paginationElements[i]);
        if ((curElement.hasClass('ws') == true) && (curElement.width() == 0)) {
            // A word space rendered with zero width marks a line break.
            linebreak = true;
        } else if ((curElement.hasClass('shy') == true) && (curElement.width() > 0)) {
            // Assumption: this condition is truncated in the source text; the 'shy'
            // class and the width comparison are inferred from the selector above.
            linebreak = true;
        } else if (curElement.hasClass('wsp') == true) {
            // Assumption: this condition is truncated in the source text; the 'wsp'
            // class is inferred from the selector above.
            linebreak = true;
        }
        if (linebreak == true) {
            introduceLineBreak();
        }
    }
}
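The width test can be illustrated without a browser by modelling each pagination element as a plain object. This is a sketch under stated assumptions: the `cls` and `width` fields and the function name `findLineBreaks` are hypothetical, and only the two conditions that also appear in the page-creation pseudocode are modelled, namely a word space measured with zero width and a block-end `wsp` marker.

```javascript
// Hypothetical model: each element has a class name and a rendered width.
function findLineBreaks(elements) {
    var breaks = [];
    for (var i = 0; i < elements.length; i += 1) {
        var el = elements[i];
        var collapsedSpace = el.cls === 'ws' && el.width === 0;
        var blockEnd = el.cls === 'wsp';
        if (collapsedSpace || blockEnd) {
            breaks.push(i); // index of the element that ends a line
        }
    }
    return breaks;
}

var measured = [
    { cls: 'ws', width: 4 },
    { cls: 'ws', width: 0 },  // word space measured with zero width
    { cls: 'ws', width: 4 },
    { cls: 'wsp', width: 0 }  // end of a block element
];
console.log(findLineBreaks(measured)); // [ 1, 3 ]
```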
For each of the determined line breaks, the pagination element processing module 502e identifies anchored floats in the reflown marked-up content of the first file format. The tagging module 502d tags the identified anchored floats. Further, for each of the determined line breaks, the pagination element processing module 502e positions the tagged anchored floats on a current page based on availability of space for the tagged anchored floats on the current page. The pagination element processing module 502e positions the tagged anchored floats proximal to associated float citations on the current page based on the availability of space for the tagged anchored floats on the current page. An example of a pseudocode of the pagination element processing module 502e executed to position anchored floats in the output hypertext markup language (HTML) document is provided below: if (lbr.hasClass('float-anchor') === true) { // if a line has float anchor
floatHeight = getFloatHeight(floatltem);
if (currentFilledHeight + floatHeight < pageHeight) {
pushFloatToCurrentPage(floatltem);
currentFilledHeight = currentFilledHeight + floatHeight;
} else {
pushFloatToNextAvailablePage(floatltem);
} Further, for each of the determined line breaks, the pagination element processing module 502e identifies footnotes in the reflown marked-up content of the first file format. The tagging module 502d tags the identified footnotes. Further, for each of the determined line breaks, the pagination element processing module 502e positions the tagged footnotes at a footnote section on the current page based on availability of space for the tagged footnotes on the current page. The pagination element processing module 502e positions the tagged footnotes proximal to associated footnote citations on the current page based on the availability of space for the tagged footnotes on the current page. An example of a pseudocode of the pagination element processing module 502e executed to position footnotes in the output hypertext markup language (HTML) document is provided below: if (lbr.hasClass('footnote') === true) { // if a line has footnote
  footnoteHeight = getFootnoteHeight(footnoteItem);
  if (currentFilledHeight + footnoteHeight < pageHeight) {
    pushFootnoteToCurrentPage(footnoteItem);
    currentFilledHeight = currentFilledHeight + footnoteHeight;
  } else {
    pushCurrentLineAndRelatedFootnotesToNextPage();
  }
}
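By way of illustration only, the float and footnote placement rules described above may be sketched in plain JavaScript that runs outside a browser. This is a minimal sketch and not the pagination element processing module 502e itself: the function name `placeLines`, the line objects with `id`, `height`, `floatHeight`, and `footnoteHeight` fields, and the pixel values are assumptions introduced for the example. As in the pseudocode, a float that does not fit on the current page is deferred to the next page, while a footnote that does not fit moves to the next page together with the line that cites it.

```javascript
// Hypothetical model: each line is { id, height, floatHeight?, footnoteHeight? }.
// Heights are in pixels. The page model and all names are assumptions.
function placeLines(lines, pageHeight) {
  const newPage = () => ({ lines: [], floats: [], footnotes: [], filled: 0 });
  const pages = [newPage()];
  let deferredFloats = []; // floats that did not fit and wait for the next page

  const startNextPage = () => {
    const page = newPage();
    // Deferred floats are placed first on the new page.
    for (const item of deferredFloats) {
      page.floats.push(item.id);
      page.filled += item.height;
    }
    deferredFloats = [];
    pages.push(page);
    return page;
  };

  let page = pages[0];
  for (const line of lines) {
    const footnoteHeight = line.footnoteHeight || 0;
    // A line and its footnote stay together: if both do not fit on a
    // non-empty page, the line and the footnote move to the next page.
    if (page.filled > 0 &&
        page.filled + line.height + footnoteHeight >= pageHeight) {
      page = startNextPage();
    }
    page.lines.push(line.id);
    page.filled += line.height;
    if (footnoteHeight > 0) {
      page.footnotes.push(line.id);
      page.filled += footnoteHeight;
    }
    if (line.floatHeight) {
      if (page.filled + line.floatHeight < pageHeight) {
        // Enough space: the float stays on the page of its citation.
        page.floats.push(line.id);
        page.filled += line.floatHeight;
      } else {
        // Not enough space: defer the float to the next available page.
        deferredFloats.push({ id: line.id, height: line.floatHeight });
      }
    }
  }
  if (deferredFloats.length > 0) startNextPage();
  return pages;
}
```

The sketch keeps a single running `filled` height per page, mirroring `currentFilledHeight` in the pseudocode. It does not attempt the browser measurements that `getFloatHeight` and `getFootnoteHeight` would perform.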
Further, the pagination element processing module 502e positions page breaks in the continuous page based on a configurable page height and the determined line breaks for the positioning of the tagged anchored floats and the tagged footnotes on a subsequent page on non-availability of space on the current page. An example of a pseudocode of the pagination element processing module 502e executed to create pages in the output hypertext markup language (HTML) document is provided below:
var wordSpaces = $(document.body).find('span.ws,div.wsp');
for (var i = 0; i < wordSpaces.length; i++) {
  var ws = wordSpaces.eq(i);
  if (ws.width() == 0 || ws.attr('class') == 'wsp') { // it's a line break
    var y = ws.offset().top;
    if (y - ydef > px) {
      pageSize.push(y - ydef);
      ydef = y;
      pageBreak = ws.attr('class', 'wspbr');
    }
  }
}
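The page creation loop above depends on browser measurements, so a browser-independent sketch may help to follow the logic. The example below is illustrative only. It assumes that each tagged space is represented by a plain object with hypothetical `width`, `top`, and `className` fields, and it treats the threshold `px` of the pseudocode as the configurable page height, which is an assumption. As in the pseudocode, a space counts as a line break when its rendered width is zero or when it carries the class `wsp`, and a page break is recorded at the first line break whose distance from the top of the current page exceeds the page height.

```javascript
// Hypothetical stand-in for the measured space elements:
// { width: rendered width, top: vertical offset, className: CSS class }.
function findPageBreaks(spaces, pageHeight) {
  const breaks = [];    // indexes of the spaces chosen as page breaks
  const pageSizes = []; // height consumed by each completed page
  let pageTop = 0;      // offset at which the current page starts

  spaces.forEach((space, index) => {
    // A collapsed space or a block-level space marks the end of a line.
    const isLineBreak = space.width === 0 || space.className === "wsp";
    if (!isLineBreak) return;
    if (space.top - pageTop > pageHeight) {
      pageSizes.push(space.top - pageTop);
      pageTop = space.top;
      breaks.push(index);
    }
  });
  return { breaks, pageSizes };
}
```

Because the threshold is tested only at line breaks, a page never splits in the middle of a line, which is the property that the tagging of line breaks is meant to provide.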
The compiler 502f groups the marked-up content with the positioned anchored floats and the positioned footnotes on each page. The pagination element processing module 502e inserts one or more pagination elements, for example, page numbers, a header, a footer, a footnote ruler, fillers, etc., on each page containing the grouped marked-up content. The compiler 502f renders the grouped marked-up content with the inserted pagination elements in the reversible second file format. An example of the pseudocode of the compiler 502f executed for performing the steps of grouping and insertion of page numbers is provided below:
function makePageBlocks() {
  var pageBreaks = content.find(".wspbr");
  var startPage = content.top();
  for (var i = 0; i < pageBreaks.length; i++) {
    endPage = pageBreaks[i];
    wrapPageWithNumber("<div class='page" + i + "'>", i, startPage, endPage);
    startPage = endPage;
  }
}
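The grouping step may be illustrated with a self-contained sketch that operates on an array of already broken lines instead of on the document object model. The sketch is an assumption-laden illustration and not the compiler 502f: the signature `makePageBlocks(lines, breakIndexes)`, the class name `page-number`, and the one-based page numbering are all introduced for the example. Unlike the pseudocode, the sketch also adds a final boundary at the end of the content so that the lines after the last page break are wrapped as well.

```javascript
// lines: array of HTML strings, one per line.
// breakIndexes: ascending indexes at which a new page starts (hypothetical input).
function makePageBlocks(lines, breakIndexes) {
  const pages = [];
  let start = 0;
  // The extra boundary at lines.length flushes the content after the last break.
  for (const end of [...breakIndexes, lines.length]) {
    if (end <= start) continue; // skip empty or out-of-order ranges
    const number = pages.length + 1;
    const body = lines.slice(start, end).join("\n");
    pages.push(
      `<div class="page page${number}">\n${body}\n` +
      `<div class="page-number">${number}</div>\n</div>`
    );
    start = end;
  }
  return pages;
}
```

Each returned string is one page block. Because the page number lives in its own element, it can later be removed without touching the content lines.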
The pagination element processing module 502e handles grouped elements comprising, for example, a float and a caption associated with the float in the reversible second file format at a position assigned in the marked-up content of the first file format to the float. If a user wants to revert to the input marked-up content page, the compiler 502f reverses the marked-up content in the reversible second file format to the first file format to restore the continuous page. An example of the pseudocode of the compiler 502f executed for reversing the PH5 mark-up to the original input (HTML) mark-up is provided below:
function removePaginationArtifacts() {
  var headerFooter = content.find(".page-header-footer");
  headerFooter.remove();
  var footnotes = content.find(".footnote");
  footnotes.moveToEndOfDocument();
  var floats = content.find(".floats");
  floats.moveAfterCitationPara();
  var paginationElements = content.find(".ws,.shy,.wsp");
  paginationElements.removeTagsWithContent();
  removeSoftHyphensAndPseudoBreaks();
}
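The reversibility property may be illustrated with a simplified round trip. The sketch below is illustrative only and uses a hypothetical node model, `{ kind, text }`, in place of HTML elements. The function `addPaginationArtifacts`, the set `ARTIFACT_KINDS`, and the fixed number of lines per page are assumptions made for the example. The relocation of footnotes and floats performed by the pseudocode is not modeled. The point shown is that when every pagination artifact is tagged distinctly from the content, removing the tagged artifacts restores the original continuous content.

```javascript
// Kinds of nodes that exist only because of pagination (hypothetical names).
const ARTIFACT_KINDS = new Set(["header", "footer", "page-number", "line-break"]);
const SOFT_HYPHEN = "\u00AD";

// Forward direction: interleave content nodes with pagination artifacts.
function addPaginationArtifacts(nodes, linesPerPage) {
  const out = [];
  nodes.forEach((node, i) => {
    if (i % linesPerPage === 0) {
      out.push({ kind: "header", text: "Running head" });
    }
    out.push({ ...node });
    out.push({ kind: "line-break", text: "" });
    const lastOnPage =
      i % linesPerPage === linesPerPage - 1 || i === nodes.length - 1;
    if (lastOnPage) {
      const pageNumber = Math.floor(i / linesPerPage) + 1;
      out.push({ kind: "page-number", text: String(pageNumber) });
    }
  });
  return out;
}

// Reverse direction: drop tagged artifacts and strip soft hyphens.
// Limitation of this sketch: soft hyphens present in the original
// content would be stripped as well.
function removePaginationArtifacts(nodes) {
  return nodes
    .filter((node) => !ARTIFACT_KINDS.has(node.kind))
    .map((node) => ({ ...node, text: node.text.split(SOFT_HYPHEN).join("") }));
}
```

Applying `removePaginationArtifacts` to the output of `addPaginationArtifacts` returns a list equal to the input, which corresponds to restoring the continuous page from the paginated view.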
FIG. 6 exemplarily illustrates the hardware architecture 600 of a client device 501 that deploys the file format transformation system (FFTS) 502 exemplarily illustrated in FIG. 5, for transforming marked-up content in a first file format to a reversible second file format. The
FFTS 502 is deployed on a computer system of the client device 501 and is programmable using a high level computer programming language. The FFTS 502 may be implemented using programmed and purposeful hardware. As exemplarily illustrated in FIG. 6, the hardware architecture 600 of the client device 501 comprises a processor 601, a non-transitory computer readable storage medium such as a memory unit 602 for storing computer programs and data, an input/output (I/O) controller 603, a network interface 604, a data bus 605, a display unit 606, input devices 607, a fixed media drive 608 such as a hard drive, a removable media drive 609 for receiving removable media, output devices 610, etc. The processor 601 refers to any one or more microprocessors, central processing unit (CPU) devices, finite state machines, computers, microcontrollers, digital signal processors, logic, a logic device, an electronic circuit, an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA), a chip, etc., or any combination thereof, capable of executing computer programs or a series of commands, instructions, or state transitions. The processor 601 may also be implemented as a processor set comprising, for example, a programmed microprocessor and a math or graphics co-processor. The processor 601 is selected, for example, from the Intel® processors such as the Itanium® microprocessor or the Pentium® processors, Advanced Micro Devices (AMD®) processors such as the Athlon® processor, UltraSPARC® processors, microSPARC® processors, hp® processors, International Business Machines (IBM®) processors such as the PowerPC® microprocessor, the MIPS® reduced instruction set computer (RISC) processor of MIPS Technologies, Inc., RISC based computer processors of ARM Holdings, Motorola® processors, Qualcomm® processors, etc. The FFTS 502 disclosed herein is not limited to employing a processor 601. The FFTS 502 may also employ a controller or a microcontroller.
The processor 601 executes the modules, for example, 502a, 502b, 502c, 502d, 502e, 502f, etc., of the FFTS 502 exemplarily illustrated in FIG. 5.
The memory unit 602 is used for storing computer programs, applications, and data. For example, the content reception module 502a, the content reflow module 502b, the space and block identification module 502c, the tagging module 502d, the pagination element processing module 502e, the compiler 502f, etc., exemplarily illustrated in FIG. 5, are stored in the memory unit 602 of the client device 501. The memory unit 602 is, for example, a random access memory (RAM) or another type of dynamic storage device that stores information and instructions for execution by the processor 601. The memory unit 602 also stores temporary variables and other intermediate information used during execution of the instructions by the processor 601. The client device 501 further comprises a read only memory (ROM) or another type of static storage device that stores static information and instructions for the processor 601. The I/O controller 603 controls input actions and output actions performed by the FFTS 502.
The network interface 604 enables connection of the client device 501 to a network, for example, a short range network or a long range network. The network is, for example, the internet. In an embodiment, the network interface 604 is provided as an interface card also referred to as a line card. The network interface 604 comprises, for example, one or more of an infrared (IR) interface, an interface implementing Wi-Fi® of Wi-Fi Alliance Corporation, a universal serial bus (USB) interface, a FireWire® interface of Apple Inc., an Ethernet interface, a frame relay interface, a cable interface, a digital subscriber line (DSL) interface, a token ring interface, a peripheral component interconnect (PCI) interface, a local area network (LAN) interface, a wide area network (WAN) interface, interfaces using serial protocols, interfaces using parallel protocols, Ethernet communication interfaces, asynchronous transfer mode (ATM) interfaces, a high speed serial interface (HSSI), a fiber distributed data interface (FDDI), interfaces based on transmission control protocol (TCP)/internet protocol (IP), interfaces based on wireless communications technology such as satellite technology, radio frequency (RF) technology, near field communication, etc. The data bus 605 permits communications between the modules, for example, 502a, 502b, 502c, 502d, 502e, 502f, etc., of the FFTS 502
exemplarily illustrated in FIG. 5. The display unit 606, via the graphical user interface (GUI) 401 exemplarily illustrated in FIGS. 4A-4C, displays information such as the marked-up content, display interfaces, user interface elements such as text fields, etc., for allowing a user of the file format transformation system (FFTS) 502 to view an input page in a first file format and a transformed output page in the reversible second file format. The display unit 606 comprises, for example, a liquid crystal display, a plasma display, an organic light emitting diode (OLED) based display, etc. The input devices 607 are used for inputting data into the client device 501. The users of the client device 501 use the input devices 607 to provide inputs to the FFTS 502. For example, a user may enter a file format or edit an input page on the GUI 401 using the input devices 607. The input devices 607 are, for example, a keyboard such as an alphanumeric keyboard, a microphone, a joystick, a pointing device such as a computer mouse, a touch pad, a light pen, a physical button, a touch sensitive display device, a track ball, a pointing stick, any device capable of sensing a tactile input, etc.
Computer applications and computer programs are used for operating the file format transformation system (FFTS) 502. The computer programs are loaded onto the fixed media drive 608 and into the memory unit 602 of the client device 501 via the removable media drive 609. In an embodiment, the computer applications and computer programs may be loaded directly via the network. Computer applications and computer programs are executed by double clicking a related icon displayed on the display unit 606 using one of the input devices 607. The output devices 610, for example, a printer, output the results of operations performed by the FFTS 502. For example, the FFTS 502 renders the transformed output page in the reversible second file format using the output devices 610.
The processor 601 executes an operating system, for example, the Linux® operating system, the Unix® operating system, any version of the Microsoft® Windows® operating system, the Mac OS of Apple Inc., the IBM® OS/2, VxWorks® of Wind River Systems, Inc., QNX Neutrino® developed by QNX Software Systems Ltd., Palm OS®, the Solaris operating system developed by Sun Microsystems, Inc., the Android operating system, the Windows Phone® operating system of Microsoft Corporation, the BlackBerry® operating system of BlackBerry Limited, the iOS operating system of Apple Inc., the Symbian™ operating system of Symbian Foundation Limited, etc. The file format transformation system (FFTS) 502 employs the operating system for performing multiple tasks. The operating system is responsible for management and coordination of activities and sharing of resources of the client device 501. The operating system further manages security of the FFTS 502, peripheral devices connected to the client device 501, and network connections. The operating system employed on the client device 501 recognizes, for example, inputs provided by the users using one of the input devices 607, the output display, files, and directories stored locally on the fixed media drive 608. The operating system on the client device 501 executes different computer programs using the processor 601. The processor 601 and the operating system together define a computer system for which application programs in high level programming languages are written. The processor 601 of the client device 501 retrieves instructions defined by the content reception module 502a, the content reflow module 502b, the space and block identification module 502c, the tagging module 502d, the pagination element processing module 502e, the compiler 502f, etc., for performing respective functions disclosed in the detailed description of FIG. 5.
The processor 601 retrieves instructions for executing the modules, for example, 502a, 502b, 502c, 502d, 502e, 502f, etc., of the FFTS 502 from the memory unit 602. A program counter determines the location of the instructions in the memory unit 602. The program counter stores a number that identifies the current position in the computer program of each of the modules, for example, 502a, 502b, 502c, 502d, 502e, 502f, etc., of the FFTS 502. The instructions fetched by the processor 601 from the memory unit 602 after being processed are decoded. The instructions are stored in an instruction register in the processor 601. After processing and decoding, the processor 601 executes the instructions, thereby performing one or more processes defined by those instructions.
At the time of execution, the instructions stored in the instruction register are examined to determine the operations to be performed. The processor 601 then performs the specified operations. The operations comprise arithmetic operations and logic operations. The operating system performs multiple routines for performing a number of tasks required to assign the input devices 607, the output devices 610, and memory for execution of the modules, for example, 502a, 502b, 502c, 502d, 502e, 502f, etc., of the file format transformation system (FFTS) 502. The tasks performed by the operating system comprise, for example, assigning memory to the modules, for example, 502a, 502b, 502c, 502d, 502e, 502f, etc., of the FFTS 502, and to data used by the FFTS 502, moving data between the memory unit 602 and disk units, and handling input/output operations. The operating system performs the tasks on request by the operations and after performing the tasks, the operating system transfers the execution control back to the processor 601. The processor 601 continues the execution to obtain one or more outputs. The outputs of the execution of the modules, for example, 502a, 502b, 502c, 502d, 502e, 502f, etc., of the FFTS 502 are displayed to the user on the display unit 606. Disclosed herein is also a computer program product comprising a non-transitory computer readable storage medium having embodied thereon, computer program codes comprising instructions executable by at least one processor 601 for transforming marked-up content in a first file format to a reversible second file format.
The computer program product comprises a first computer program code for receiving the marked-up content of the first file format; a second computer program code for reflowing the received marked-up content of the first file format into a continuous page having a configurable page width; a third computer program code for identifying spaces and block elements in the reflown marked-up content of the first file format; a fourth computer program code for generating and appending tags to the identified spaces and the identified block elements in the reflown marked-up content of the first file format; a fifth computer program code for determining line breaks in the reflown marked-up content of the first file format based on preconfigured criteria associated with the appended tags; a sixth computer program code for tagging the determined line breaks; a seventh computer program code for identifying anchored floats in the reflown marked-up content of the first file format; an eighth computer program code for tagging the identified anchored floats; a ninth computer program code for positioning the tagged anchored floats on a current page based on availability of space for the tagged anchored floats on the current page; a tenth computer program code for identifying footnotes in the reflown marked-up content of the first file format; an eleventh computer program code for tagging the identified footnotes; a twelfth computer program code for positioning the tagged footnotes at a footnote section on the current page based on availability of space for the tagged footnotes on the current page; a thirteenth computer program code for positioning page breaks in the continuous page based on a configurable page height and the determined line breaks for positioning the tagged anchored floats and the tagged footnotes on a subsequent page on non-availability of the space on the current page; a fourteenth computer program code for grouping the marked-up content with the positioned anchored floats and the positioned footnotes on each page; a fifteenth computer program code for inserting one or more of multiple pagination elements, for example, page numbers, a header, a footer, a footnote ruler, fillers, etc., on each page containing the grouped marked-up content; and a sixteenth computer program code for rendering the grouped marked-up content with the inserted pagination elements in the reversible second file format, where the reversible second file format allows the marked-up content to be reversed to the first file format to restore the continuous page. The ninth computer program code positions the tagged anchored floats proximal to associated float citations on the current page based on the availability of space for the tagged anchored floats on the current page. The twelfth computer program code positions the tagged footnotes proximal to associated footnote citations on the current page based on the availability of space for the tagged footnotes on the current page.
The computer program product disclosed herein further comprises one or more additional computer program codes for performing additional steps that may be required and contemplated for transforming marked-up content in a first file format to a reversible second file format. In an embodiment, a single piece of computer program code comprising computer executable instructions performs one or more steps of the computer implemented method disclosed herein for transforming marked-up content in a first file format to a reversible second file format. The computer program codes comprising computer executable instructions are embodied on the non-transitory computer readable storage medium. The processor 601 of the client device 501 retrieves these computer executable instructions and executes them. When the computer executable instructions are executed by the processor 601, the computer executable instructions cause the processor 601 to perform the steps of the computer implemented method for transforming marked-up content of a first file format to a reversible second file format.
FIGS. 7A-7Q exemplarily illustrate screenshots showing transformation of marked-up content in a first file format to a reversible second file format in edit and proof views. Consider an example where the file format transformation system (FFTS) 502 is configured as a software application on a client device 501 exemplarily illustrated in FIG. 5, for example, a personal computer, a laptop, a smart phone, a tablet computing device, etc. A user of the client device 501 may want to edit and review a technical document of, for example, a hypertext markup language (HTML) format that is viewed as a running continuous page. The user invokes the FFTS 502 on the client device 501 and loads the input HTML document into the FFTS 502. The FFTS 502 allows the user to view the input HTML document via a graphical user interface (GUI) 401 of the FFTS 502. FIG. 7A exemplarily illustrates a screenshot of an opening page of the loaded input HTML document without an edit window 402 in a right pane of the GUI 401. FIG. 7B exemplarily illustrates a screenshot of the opening page of the loaded input HTML document, showing the edit window 402 in the right pane of the GUI 401. The edit window 402 allows the user to edit the input HTML document or accept suggested changes made by other users to the input HTML document in an edit view exemplarily illustrated in FIG. 7B. FIG. 7C exemplarily illustrates a screenshot of the output HTML page transformed by the FFTS 502 to a reversible file format, showing a header 701 and a footer 702 entered on the opening page in a proof view. The FFTS 502 positions the marked-up content in an appropriate location close to their respective citations in the proof view. The opening page in the reversible file format can be reversed to the first file format in the edit view.
FIG. 7D exemplarily illustrates a screenshot without the edit window 402 in the right pane of the GUI 401, showing hyphenations 703 entered in a page of the input hypertext markup language (HTML) document. FIG. 7E exemplarily illustrates a screenshot showing the hyphenations 703 entered in the HTML page. The user can edit the HTML page using the edit window 402 in the right pane of the GUI 401 exemplarily illustrated in FIG. 7E. The edit window 402 allows the user to edit the hyphenated HTML page. FIG. 7F exemplarily illustrates a screenshot of the output HTML page with hyphenations 703 transformed by the FFTS 502 to a reversible file format, showing the header 701 entered in the hyphenated HTML page in a proof view.
FIG. 7G exemplarily illustrates a screenshot of a page of the input hypertext markup language (HTML) document containing floats, for example, figures 704, without the edit window 402 in the right pane of the GUI 401. FIG. 7H exemplarily illustrates a screenshot of the page of the input HTML document containing the figures 704, showing the edit window 402 in the right pane of the GUI 401. The edit window 402 allows the user to edit the input HTML page containing the figures 704. FIG. 7I exemplarily illustrates a screenshot of the output HTML page transformed by the file format transformation system (FFTS) 502 exemplarily illustrated in FIG. 5, to a reversible file format, showing the header 701, the footer 702, and a page number 705 entered on the output HTML page in a proof view. The FFTS 502 positions the figures 704 in the appropriate location close to a respective citation in the proof view.
FIG. 7J exemplarily illustrates a screenshot of a page of the input hypertext markup language (HTML) document containing a float, for example, a table 706, without the edit window 402 in a right pane of the GUI 401. FIG. 7K exemplarily illustrates a screenshot of the page of the input HTML document containing the table 706, showing the edit window 402 in the right pane of the GUI 401. The edit window 402 allows the user to edit the HTML page containing the table 706. FIG. 7L exemplarily illustrates a screenshot of the output HTML page transformed by the file format transformation system (FFTS) 502 exemplarily illustrated in FIG. 5, to a reversible file format, showing the header 701, the footer 702, and a page number 705 entered on the page in a proof view. The FFTS 502 positions the table 706 in the appropriate location close to a respective citation in the proof view.
FIG. 7M exemplarily illustrates a screenshot of a page of the input hypertext markup language (HTML) document containing footnotes 707, without the edit window 402 in a right pane of the GUI 401. FIG. 7N exemplarily illustrates a screenshot of the page of the input HTML document containing the footnotes 707, showing the edit window 402 in the right pane of the GUI 401. The edit window 402 allows the user to edit the page containing the footnotes 707. FIG. 7O exemplarily illustrates a screenshot of the output HTML page transformed by the file format transformation system (FFTS) 502 exemplarily illustrated in FIG. 5, to a reversible file format, showing the header 701, the footer 702, a page number 705 entered on the page, and the footnotes 707 positioned in the footnote section below a footnote ruler 708 in a proof view.
FIG. 7P exemplarily illustrates a screenshot of an output HTML page transformed by the file format transformation system (FFTS) 502 exemplarily illustrated in FIG. 5, to a reversible file format, showing the header 701 and the footer 702 at the top and the bottom of the page respectively in a proof view. The output HTML page also contains a page number 705 and a footnote 707 positioned in the footnote section below a footnote ruler 708 in the proof view.
FIG. 7Q exemplarily illustrates a screenshot of output HTML pages transformed by the file format transformation system (FFTS) 502 exemplarily illustrated in FIG. 5, to a reversible file format, showing a page break 709 in a proof view. The FFTS 502 breaks the running continuous input HTML page into individual reversible file format pages containing a header 701 and a footer 702, and renders the output on the GUI 401.
It will be readily apparent that the various methods, algorithms, and computer programs disclosed herein may be implemented on computer readable media appropriately programmed for computing devices. As used herein, "computer readable media" refers to non-transitory computer readable media that participate in providing data, for example, instructions that may be read by a computer, a processor or a similar device. Non-transitory computer readable media comprise all computer readable media, for example, non-volatile media, volatile media, and transmission media, except for a transitory, propagating signal. Non-volatile media comprise, for example, optical discs or magnetic disks and other persistent memory. Volatile media comprise, for example, a dynamic random access memory (DRAM), which typically constitutes a main memory, a register memory, a processor cache, a random access memory (RAM), etc. Transmission media comprise, for example, coaxial cables, copper wire, fiber optic cables, modems, etc., including wires that constitute a system bus coupled to a processor, etc. Common forms of computer readable media comprise, for example, a floppy disk, a flexible disk, a hard disk, magnetic tape, a laser disc, a Blu-ray Disc® of the Blu-ray Disc Association, any magnetic medium, a compact disc-read only memory (CD-ROM), a digital versatile disc (DVD), any optical medium, a flash memory card, punch cards, paper tape, any other physical medium with patterns of holes, a random access memory (RAM), a programmable read only memory (PROM), an erasable programmable read only memory (EPROM), an electrically erasable programmable read only memory (EEPROM), a flash memory, any other memory chip or cartridge, or any other medium from which a computer can read.
The computer programs that implement the methods and algorithms disclosed herein may be stored and transmitted using a variety of media, for example, the computer readable media in a number of manners. In an embodiment, hard-wired circuitry or custom hardware may be used in place of, or in combination with, software instructions for implementation of the processes of various embodiments. Therefore, the embodiments are not limited to any specific combination of hardware. The computer program codes comprising computer executable instructions may be implemented in any programming language that runs on an internet browser, for example, Chrome™ of Google Inc., Firefox® of Mozilla Foundation, Safari® of Apple Inc., Internet Explorer® of Microsoft Corporation, etc., on any operating system. The computer program codes or software programs may be stored on or in one or more mediums as object code. Various aspects of the computer implemented method and the file format transformation system (FFTS) 502 disclosed herein may be implemented in a non-programmed environment comprising documents created, for example, in a hypertext markup language (HTML), an extensible markup language (XML), or other formats that render aspects of a graphical user interface (GUI) or perform other functions, when viewed in a visual area or a window of a browser program.
Various aspects of the computer implemented method and the FFTS 502 disclosed herein may be implemented as programmed elements, or non-programmed elements, or any suitable combination thereof. The computer program product disclosed herein comprises one or more computer program codes for implementing the processes of various embodiments. The computer implemented method and the FFTS 502 disclosed herein are not limited to a particular computer system platform, processor, or operating system. The foregoing examples have been provided merely for the purpose of explanation and are in no way to be construed as limiting of the computer implemented method and the file format transformation system (FFTS) 502 disclosed herein. While the computer implemented method and the FFTS 502 have been described with reference to various embodiments, it is understood that the words, which have been used herein, are words of description and illustration, rather than words of limitation. Further, although the computer implemented method and the FFTS 502 have been described herein with reference to particular means, materials, and embodiments, the computer implemented method and the FFTS 502 are not intended to be limited to the particulars disclosed herein; rather, the computer implemented method and the FFTS 502 extend to all functionally equivalent structures, methods and uses, such as are within the scope of the appended claims. Those skilled in the art, having the benefit of the teachings of this specification, may effect numerous modifications thereto and changes may be made without departing from the scope and spirit of the computer implemented method and the FFTS 502 disclosed herein in their aspects.

Claims

CLAIMS We claim:
1. A computer implemented method for transforming marked-up content in a first file format to a reversible second file format, said method employing a file format transformation system deployed on a client device comprising at least one processor configured to execute computer program instructions for performing said method, said method comprising:
receiving said marked-up content of said first file format by said file format transformation system;
reflowing said received marked-up content of said first file format into a continuous page having a configurable page width by said file format transformation system;
identifying spaces and block elements in said reflown marked-up content of said first file format by said file format transformation system;
generating and appending tags to said identified spaces and said identified block elements in said reflown marked-up content of said first file format by said file format transformation system;
for each of said identified spaces and said identified block elements:
determining line breaks in said reflown marked-up content of said first file format based on preconfigured criteria associated with said appended tags by said file format transformation system and tagging said determined line breaks by said file format transformation system;
for each of said determined line breaks:
identifying anchored floats in said reflown marked-up content of said first file format by said file format transformation system and tagging said identified anchored floats by said file format transformation system;
positioning said tagged anchored floats on a current page by said file format transformation system based on availability of space for said tagged anchored floats on said current page;
identifying footnotes in said reflown marked-up content of said first file format by said file format transformation system and tagging said identified footnotes by said file format transformation system;
positioning said tagged footnotes at a footnote section on said current page by said file format transformation system based on availability of space for said tagged footnotes on said current page;
positioning page breaks in said continuous page by said file format transformation system based on a configurable page height and said determined line breaks for said positioning of said tagged anchored floats and said tagged footnotes on a subsequent page on non-availability of said space on said current page;
grouping said marked-up content with said positioned anchored floats and said positioned footnotes on each page by said file format transformation system; and
inserting one or more of a plurality of pagination elements on said each page containing said grouped marked-up content by said file format transformation system; and
rendering said grouped marked-up content with said inserted one or more of said pagination elements in said reversible second file format by said file format transformation system, wherein said reversible second file format allows said marked-up content to be reversed to said first file format to restore said continuous page.
2. The computer implemented method of claim 1, wherein said tagged anchored floats are positioned proximal to associated float citations on said current page based on said availability of said space for said tagged anchored floats on said current page.
3. The computer implemented method of claim 1, wherein said tagged footnotes are positioned proximal to associated footnote citations on said current page based on availability of space for said tagged footnotes on said current page.
4. The computer implemented method of claim 1, wherein said first file format is one of a hypertext markup language format, an extensible hypertext markup language format, and an extensible markup language format.
5. The computer implemented method of claim 1, wherein said determined line breaks retain integrity of said reversible second file format by hyphenating and adjusting spaces in said rendered marked-up content.
6. The computer implemented method of claim 1, further comprising handling grouped elements comprising a float and a caption associated with said float in said reversible second file format at a position assigned in said marked-up content of said first file format to said float by said file format transformation system.
7. The computer implemented method of claim 1, wherein said pagination elements comprise page numbers, a header, a footer, a footnote ruler, fillers, and any combination thereof.
8. The computer implemented method of claim 1, further comprising hyphenation and justification of said rendered marked-up content in said reversible second file format by said file format transformation system to provide kerning based on aesthetics.
9. The computer implemented method of claim 1, wherein said rendered marked-up content in said reversible second file format is accessible on a plurality of browsers on a plurality of operating systems.
10. A system for transforming marked-up content in a first file format to a reversible second file format, said system comprising: a non-transitory computer readable storage medium configured to store computer program instructions defined by modules of a file format transformation system; at least one processor communicatively coupled to said non-transitory computer readable storage medium, said at least one processor configured to execute said defined computer program instructions; and said file format transformation system comprising: a content reception module configured to receive said marked-up content of said first file format; a content reflow module configured to reflow said received marked-up content of said first file format into a continuous page having a configurable page width; a space and block identification module configured to identify spaces and block elements in said reflown marked-up content of said first file format; a tagging module configured to generate and append tags to said identified spaces and said identified block elements in said reflown marked-up content of said first file format; for each of said identified spaces and said identified block elements: a pagination element processing module configured to determine line breaks in said reflown marked-up content of said first file format based on preconfigured criteria associated with said appended tags, wherein said tagging module is further configured to tag said determined line breaks; for each of said determined line breaks: said pagination element processing module further configured to identify anchored floats in said reflown marked-up content of said first file format, wherein said tagging module is further configured to tag said identified anchored floats; said pagination element processing module further configured to position said tagged anchored floats on a current page based on availability of space for said tagged anchored floats on said current page; said pagination 
element processing module further configured to identify footnotes in said reflown marked-up content of said first file format, wherein said tagging module is further configured to tag said identified footnotes; said pagination element processing module further configured to position said tagged footnotes at a footnote section on said current page based on availability of space for said tagged footnotes on said current page; said pagination element processing module further configured to position page breaks in said continuous page based on a configurable page height and said determined line breaks for said positioning of said tagged anchored floats and said tagged footnotes on a subsequent page on nonavailability of said space on said current page; a compiler configured to group said marked-up content with said positioned anchored floats and said positioned footnotes on each page; and said pagination element processing module further configured to insert one or more of a plurality of pagination elements on said each page containing said grouped marked-up content; and said compiler further configured to render said grouped marked-up content with said inserted one or more of said pagination elements in said reversible second file format, wherein said reversible second file format allows said marked-up content to be reversed to said first file format to restore said continuous page.
11. The system of claim 10, wherein said pagination element processing module is configured to position said tagged anchored floats proximal to associated float citations on said current page based on said availability of said space for said tagged anchored floats on said current page.
12. The system of claim 10, wherein said pagination element processing module is configured to position said tagged footnotes proximal to associated footnote citations on said current page based on said availability of said space for said tagged footnotes on said current page.
13. The system of claim 10, wherein said first file format is one of a hypertext markup language format, an extensible hypertext markup language format, and an extensible markup language format.
14. The system of claim 10, wherein said determined line breaks retain integrity of said reversible second file format by hyphenating and adjusting spaces in said rendered marked-up content.
15. The system of claim 10, wherein said pagination element processing module is further configured to handle grouped elements comprising a float and a caption associated with said float in said reversible second file format at a position assigned in said marked-up content of said first file format to said float.
16. The system of claim 10, wherein said pagination elements comprise page numbers, a header, a footer, a footnote ruler, fillers, and any combination thereof.
17. A computer program product comprising a non-transitory computer readable storage medium having embodied thereon computer program codes comprising instructions executable by at least one processor for transforming marked-up content in a first file format to a reversible second file format, said computer program codes comprising: a first computer program code for receiving said marked-up content of said first file format; a second computer program code for reflowing said received marked-up content of said first file format into a continuous page having a configurable page width; a third computer program code for identifying spaces and block elements in said reflown marked-up content of said first file format; a fourth computer program code for generating and appending tags to said identified spaces and said identified block elements in said reflown marked-up content of said first file format; for each of said identified spaces and said identified block elements: a fifth computer program code for determining line breaks in said reflown marked-up content of said first file format based on preconfigured criteria associated with said appended tags and a sixth computer program code for tagging said determined line breaks; for each of said determined line breaks: a seventh computer program code for identifying anchored floats in said reflown marked-up content of said first file format and an eighth computer program code for tagging said identified anchored floats; a ninth computer program code for positioning said tagged anchored floats on a current page based on availability of space for said tagged anchored floats on said current page; a tenth computer program code for identifying footnotes in said reflown marked-up content of said first file format and an eleventh computer program code for tagging said identified footnotes; a twelfth computer program code for positioning said tagged footnotes at a footnote section on said current page based on availability of space 
for said tagged footnotes on said current page; a thirteenth computer program code for positioning page breaks in said
continuous page based on a configurable page height and said determined line breaks for said positioning of said tagged anchored floats and said tagged footnotes on a subsequent page on non-availability of said space on said current page; a fourteenth computer program code for grouping said marked-up content with said positioned anchored floats and said positioned footnotes on each page; and a fifteenth computer program code for inserting one or more of a plurality of pagination elements on said each page containing said grouped marked-up content; and a sixteenth computer program code for rendering said grouped marked-up content with said inserted one or more of said pagination elements in said reversible second file format, wherein said reversible second file format allows said marked-up content to be reversed to said first file format to restore said continuous page.
18. The computer program product of claim 17, wherein said ninth computer program code positions said tagged anchored floats proximal to associated float citations on said current page based on said availability of said space for said tagged anchored floats on said current page.
19. The computer program product of claim 17, wherein said twelfth computer program code positions said tagged footnotes proximal to associated footnote citations on said current page based on said availability of said space for said tagged footnotes on said current page.
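The steps recited in the independent claims (line breaks at spaces, footnotes kept on the page of their citation when space allows, page breaks from a configurable page height, and reversal to the continuous page) can be illustrated with a minimal Python sketch. It is not the applicant's implementation. Every name in it (`Page`, `break_lines`, `paginate`, `restore`) is hypothetical, widths are measured in characters and heights in lines as simplifying assumptions, each footnote is assumed to occupy one line, and anchored floats, hyphenation and justification are omitted.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    number: int                                     # page number (a pagination element)
    lines: list = field(default_factory=list)       # body lines placed on this page
    footnotes: list = field(default_factory=list)   # footnote section of this page


def break_lines(text, width):
    """Greedy line breaking: break at a space when the next word no longer fits."""
    lines, current = [], ""
    for word in text.split():
        candidate = word if not current else current + " " + word
        if len(candidate) <= width or not current:
            current = candidate        # word fits (or is alone on an empty line)
        else:
            lines.append(current)      # close the line at the preceding space
            current = word
    if current:
        lines.append(current)
    return lines


def paginate(text, footnotes, width, height):
    """Place lines and their footnotes on pages of at most `height` lines.

    `footnotes` maps a citation marker such as "[1]" to its note text
    (an assumed input shape, not taken from the source).
    """
    pages = [Page(1)]
    for line in break_lines(text, width):
        notes = [note for marker, note in footnotes.items() if marker in line]
        page = pages[-1]
        used = len(page.lines) + len(page.footnotes)
        # No space for the line plus its footnotes: break to a subsequent page.
        if used + 1 + len(notes) > height and page.lines:
            page = Page(page.number + 1)
            pages.append(page)
        page.lines.append(line)
        page.footnotes.extend(notes)
    return pages


def restore(pages):
    """Reverse the pagination: rejoin body lines into the continuous text."""
    return " ".join(line for page in pages for line in page.lines)
```

In this sketch reversal is possible because the line and page breaks are kept as structure around the text rather than written into it, so discarding that structure yields the original continuous content (with whitespace normalized to single spaces).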
PCT/IN2016/000159 2015-07-01 2016-06-22 Transformation of marked-up content to a reversible file format for automated browser based pagination WO2017002130A1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP16817386.2A EP3317780A4 (en) 2015-07-01 2016-06-22 Transformation of marked-up content to a reversible file format for automated browser based pagination
US15/551,292 US10157238B2 (en) 2015-07-01 2016-06-22 Transformation of marked-up content to a reversible file format for automated browser based pagination
US15/695,017 US10318614B2 (en) 2015-07-01 2017-09-05 Transformation of marked-up content into a file format that enables automated browser based pagination

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
IN3348CH2015 2015-07-01
IN3348/CHE/2015 2015-07-01

Publications (1)

Publication Number Publication Date
WO2017002130A1 (en) 2017-01-05

Family

ID=57608169

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IN2016/000159 WO2017002130A1 (en) 2015-07-01 2016-06-22 Transformation of marked-up content to a reversible file format for automated browser based pagination

Country Status (3)

Country Link
US (1) US10157238B2 (en)
EP (1) EP3317780A4 (en)
WO (1) WO2017002130A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3602352A4 (en) * 2017-03-30 2020-10-28 TNQ Books And Journals Private Limited Transformation of marked-up content into a file format that enables automated browser based pagination

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10769348B1 (en) * 2019-09-23 2020-09-08 Typetura Llc Dynamic typesetting
CN113723063B (en) * 2021-09-02 2023-06-13 四川启睿克科技有限公司 Method for converting RTF (real time transport format) into HTML (hypertext markup language) and realizing effect in PDF (portable document format) file

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6556217B1 (en) * 2000-06-01 2003-04-29 Nokia Corporation System and method for content adaptation and pagination based on terminal capabilities
US20050251742A1 (en) 2000-09-27 2005-11-10 Microsoft Corporation View templates for HTML source documents
US20120304051A1 (en) * 2011-05-27 2012-11-29 Diacritech Technologies Pvt Ltd Automation Tool for XML Based Pagination Process
US8499236B1 (en) 2010-01-21 2013-07-30 Amazon Technologies, Inc. Systems and methods for presenting reflowable content on a display
US20140281924A1 (en) 2013-03-14 2014-09-18 Aol Inc. Systems and methods for horizontally paginating html content

Family Cites Families (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5336124A (en) * 1992-09-24 1994-08-09 Garside Ted L Horizontal skinning and protection apparatus
US5789229A (en) * 1994-09-30 1998-08-04 Uab Research Foundation Stranded RNA virus particles
US5779154A (en) * 1996-10-23 1998-07-14 Martin; Blake T. Garden sprinkler adapter device
US7778954B2 (en) * 1998-07-21 2010-08-17 West Publishing Corporation Systems, methods, and software for presenting legal case histories
US6336124B1 (en) 1998-10-01 2002-01-01 Bcl Computers, Inc. Conversion data representing a document to other formats for manipulation and display
US7028258B1 (en) * 1999-10-01 2006-04-11 Microsoft Corporation Dynamic pagination of text and resizing of image to fit in a document
US6779154B1 (en) 2000-02-01 2004-08-17 Cisco Technology, Inc. Arrangement for reversibly converting extensible markup language documents to hypertext markup language documents
US6789229B1 (en) 2000-04-19 2004-09-07 Microsoft Corporation Document pagination based on hard breaks and active formatting tags
CA2393035A1 (en) * 2002-07-11 2004-01-11 Ibm Canada Limited-Ibm Canada Limitee Converting markup language files
US7024415B1 (en) * 2002-07-31 2006-04-04 Bellsouth Intellectual Property Corporation File conversion
US20060259524A1 (en) * 2003-03-17 2006-11-16 Horton D T Systems and methods for document project management, conversion, and filing
US7653876B2 (en) * 2003-04-07 2010-01-26 Adobe Systems Incorporated Reversible document format
US8442331B2 (en) * 2004-02-15 2013-05-14 Google Inc. Capturing text from rendered documents using supplemental information
US7496835B1 (en) * 2004-10-31 2009-02-24 Adobe Systems Incorporated Document generation from web pages
CN101055578A (en) * 2006-04-12 2007-10-17 龙搜(北京)科技有限公司 File content dredger based on rule
CN101055577A (en) * 2006-04-12 2007-10-17 龙搜(北京)科技有限公司 Collector capable of extending markup language
US20110131482A1 (en) * 2009-12-02 2011-06-02 Olive Software Inc. System and method for multi-channel publishing
US20150199314A1 (en) * 2010-10-26 2015-07-16 Google Inc. Editing Application For Synthesized eBooks
US20130334300A1 (en) * 2011-01-03 2013-12-19 Curt Evans Text-synchronized media utilization and manipulation based on an embedded barcode
US8700986B1 (en) * 2011-03-18 2014-04-15 Google Inc. System and method for displaying a document containing footnotes
BR112015005059A2 (en) * 2012-09-07 2017-07-04 American Chemical Soc automated composition appraiser
US9275017B2 (en) * 2013-05-06 2016-03-01 The Speed Reading Group, Chamber Of Commerce Number: 60482605 Methods, systems, and media for guiding user reading on a screen
WO2015184534A1 (en) * 2014-06-06 2015-12-10 Foulnes Services Corp. System and method for generating task-embedded documents

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
ALFIE ABDUL-RAHMAN ET AL.: "Automatic Pagination of HTML Documents in a Web Browser", HP LABS TECHREPORTS, 24 July 2009 (2009-07-24), XP055342263, Retrieved from the Internet <URL:http://www.hpl.hp.com/techreports/2009/HPL-2009- 123.pdf> [retrieved on 20161007] *
DONALD E. KNUTH, COMPUTERS AND TYPESETTING, VOLUME A: THE TEXBOOK, 11 January 1996 (1996-01-11)
See also references of EP3317780A4 *

Also Published As

Publication number Publication date
EP3317780A1 (en) 2018-05-09
US10157238B2 (en) 2018-12-18
EP3317780A4 (en) 2018-11-21
US20180239775A1 (en) 2018-08-23

Similar Documents

Publication Publication Date Title
US10318614B2 (en) Transformation of marked-up content into a file format that enables automated browser based pagination
EP2663932B1 (en) Systems, methods, and interfaces for display of inline content and block level content on an access device
US7770107B2 (en) Methods and systems for extracting and processing translatable and transformable data from XSL files
WO2011072434A1 (en) System and method for web content extraction
KR102574306B1 (en) dynamic typesetting
Kottwitz LaTeX beginner's guide
US10157238B2 (en) Transformation of marked-up content to a reversible file format for automated browser based pagination
McGrath HTML, CSS & JavaScript in easy steps
Sikos Web Standards: Mastering HTML5, CSS3, and XML
US10671801B2 (en) Markup code generator
WO2018179002A1 (en) Transformation of marked-up content into a file format that enables automated browser based pagination
US20240119218A1 (en) Device dependent rendering of pdf content
Turčić et al. Dynamic mathematical layout in e-books
CN111143749A (en) Webpage display method, device, equipment and storage medium
US11030387B1 (en) Device dependent rendering of PDF content including multiple articles and a table of contents
US9984053B2 (en) Replicating the appearance of typographical attributes by adjusting letter spacing of glyphs in digital publications
US20130031460A1 (en) Using a common input/output format to generate a page of an electronic document
McGrath HTML in easy steps
Hassan et al. The browser as a document composition engine
Chen et al. Transforming web pages to become standard-compliant through reverse engineering
Nakanishi et al. Adaptation of Journal Article Tag Suite XML for Japanese humanities papers
Dyer An examination of typographic standards and their relevance to contemporary user-centred web and application design
Nechitailenko Converting LaTeX to HTML5 and EPUB3: A case study
Youngblood et al. Web Design
Katoh et al. Reducing costs and expanding XML submissions with PDF to JATS conversion

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16817386

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 15551292

Country of ref document: US

NENP Non-entry into the national phase

Ref country code: DE

WWE Wipo information: entry into national phase

Ref document number: 2016817386

Country of ref document: EP