Detailed Description
The existing mode of online publication converts a paper into PDF (Portable Document Format). After a user downloads the paper locally, the user cannot search or copy its content, and therefore cannot interact with the paper.
In the online publishing scheme provided by this disclosure, the structured information in a paper is extracted, a structured document is generated from it automatically, and the structured document is then converted into a page that a browser can display. A page generated in this way includes all the content of the paper, written into the page as text rather than embedded as a PDF or similar file. Once the page is placed on the network, a user can perform operations such as searching and copying on the page, which improves the interaction between the user and the paper and thus the user experience.
FIG. 1 is a flow chart illustrating a method for online publication of a paper according to an exemplary embodiment of the present invention.
As shown in FIG. 1, the method for online publication of a paper provided in this embodiment includes:
Step 101, acquiring a paper to be published, and extracting structured information included in the paper to be published according to a preset rule.
The method provided by the embodiment may be executed by a device with a computing function, for example, the device may be a computer, and the computer may be a background server of a website or an application program.
Specifically, the user may upload the paper to be published through a website or an application. For example, a user can open a website that supports paper publication and upload his or her own paper, which is then sent over the network to the background server of the website. Alternatively, the user can upload the paper through an application installed on a network-capable terminal, such as a computer, a mobile phone, or a tablet computer; in this case the paper is likewise sent over the network to the background server of the application.
In practice, the papers to be published that users upload are in Word format.
Further, the server may store the paper in a database after receiving the paper, and the publishing device executing the method provided by the embodiment may obtain the paper to be published from the database. The publishing device may be the same device as the server that receives the paper to be published or may be a different device. The database for storing the papers to be published can be arranged in the publishing equipment, can be arranged in a background server, and can also be arranged in other equipment.
Optionally, after the paper to be published is stored, it may be screened to determine whether it meets the publishing condition. If the paper meets the publishing condition, the publishing device may obtain it. If not, the paper can be marked, and the publishing device does not acquire marked papers.
In practice, the publishing device can obtain papers to be published in the order in which they were stored in the database: a paper stored earlier is obtained first, and a paper stored later is obtained afterwards.
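The acquisition order described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the `StoredPaper` record, its `marked` flag, and the function name are hypothetical, and an in-memory list stands in for the database.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Iterator


@dataclass
class StoredPaper:
    """Hypothetical database record for a paper to be published."""
    paper_id: int
    stored_at: datetime
    marked: bool = False  # True when screening found the paper unfit for publication


def papers_in_acquisition_order(papers: Iterable[StoredPaper]) -> Iterator[StoredPaper]:
    """Yield unmarked papers, earliest storage time first."""
    eligible = (p for p in papers if not p.marked)
    yield from sorted(eligible, key=lambda p: p.stored_at)


queue = [
    StoredPaper(3, datetime(2024, 1, 3)),
    StoredPaper(1, datetime(2024, 1, 1)),
    StoredPaper(2, datetime(2024, 1, 2), marked=True),
]
print([p.paper_id for p in papers_in_acquisition_order(queue)])  # [1, 3]
```

A real deployment would more likely express the same ordering in the database query itself; the sort is done in memory here only to keep the sketch self-contained.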
After the publishing device acquires the paper to be published, the structured information included in the paper can be extracted. Structured information is information that has been analyzed and decomposed into a plurality of mutually related components, each of which has a clear place in a hierarchical structure. For example, for a paper, the structured information may include a title, an author, a body, and the like. The extraction can be performed according to the text hierarchy of the paper. Papers generally share a similar structure: for example, the font of the title is larger than the fonts of the other parts, the author information is written below the title, and so on.
Therefore, the preset rules can be set according to the hierarchical structure of the paper, so that the publishing device can extract the structural information included in the paper to be published according to the preset rules.
Specifically, the hierarchical structure of different types of papers is not completely the same. Various extraction rules can be set in the preset rules for extracting different types of papers. Each extraction rule may have an identification of a paper type, e.g., a graduation paper, a journal paper, etc. When a user uploads a paper, the paper type can be selected, correspondingly, the publishing device can acquire the paper type selected by the user before extracting the structured information, determine an extraction rule in a preset rule according to the type, and extract the structured information included in the paper based on the extraction rule. The publishing device may extract content belonging to each part from the paper by extracting the structured information, for example, extracting content belonging to a title, content belonging to a body, content belonging to an author, and publishing the paper based on these contents.
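The selection of an extraction rule by paper type can be sketched as follows. The sketch assumes the paper has already been parsed into paragraphs that carry a font size; the `Paragraph` type, the two rule functions, the type identifiers, and the cover-line layout of a graduation paper are all hypothetical. The title heuristic (largest font, author directly below) follows the example given in the text.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Paragraph:
    text: str
    font_size: float


def extract_journal_paper(paragraphs: List[Paragraph]) -> Dict[str, object]:
    """Title = paragraph with the largest font; author = paragraph directly below it."""
    title_index = max(range(len(paragraphs)), key=lambda i: paragraphs[i].font_size)
    author_index = title_index + 1
    has_author = author_index < len(paragraphs)
    return {
        "title": paragraphs[title_index].text,
        "author": paragraphs[author_index].text if has_author else "",
        "body": [p.text for p in paragraphs[author_index + 1:]],
    }


def extract_graduation_paper(paragraphs: List[Paragraph]) -> Dict[str, object]:
    """Assumed layout: one cover line precedes the title block, so skip it."""
    return extract_journal_paper(paragraphs[1:])


# Each extraction rule is registered under the identifier of its paper type.
EXTRACTION_RULES: Dict[str, Callable[[List[Paragraph]], Dict[str, object]]] = {
    "journal": extract_journal_paper,
    "graduation": extract_graduation_paper,
}


def extract_structured_info(paper_type: str, paragraphs: List[Paragraph]) -> Dict[str, object]:
    """Pick the rule matching the user-selected paper type and apply it."""
    if paper_type not in EXTRACTION_RULES:
        raise ValueError(f"no extraction rule for paper type {paper_type!r}")
    return EXTRACTION_RULES[paper_type](paragraphs)


sample = [
    Paragraph("A Study of Widgets", 22.0),
    Paragraph("Jane Doe", 12.0),
    Paragraph("Widgets are widely used.", 10.5),
]
info = extract_structured_info("journal", sample)
print(info["title"])  # A Study of Widgets
```

Keeping the rules in a lookup table means that supporting a further paper type only requires registering one more function.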
Step 102, generating a static page corresponding to the paper to be published according to the structured information.
Further, the structured information records which content belongs to which part of the paper, so the static page that is finally used for presentation can be generated from this content.
Wherein the static page is a page that can be displayed through a browser, in which a static resource can be set. When the browser loads the static page, the static resource is also loaded.
In practical application, tags, such as abstracts, texts, references and the like, can be set in the static page, and specific contents in the paper to be published are supplemented under the corresponding tags, for example, the abstract contents are supplemented at the positions of the tag abstracts. When loading the static page, corresponding static resources may be loaded according to the tags, where the static resources may include paper content corresponding to each tag.
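As an illustration of filling paper content in under the corresponding tags, the following sketch renders a small static HTML page. The function name, the `SECTION_LABELS` table, and the choice of `<section>` elements are assumptions made for this example; the embodiment does not prescribe a particular markup.

```python
from html import escape
from typing import Dict

# Hypothetical tags of the static page, in display order: section key -> visible label.
SECTION_LABELS = {"abstract": "Abstract", "body": "Text", "references": "References"}


def render_static_page(title: str, sections: Dict[str, str]) -> str:
    """Place each piece of paper content under its tag and return the page as text."""
    parts = [
        "<!DOCTYPE html>",
        "<html>",
        f'<head><meta charset="utf-8"><title>{escape(title)}</title></head>',
        "<body>",
        f"<h1>{escape(title)}</h1>",
    ]
    for key, label in SECTION_LABELS.items():
        if key not in sections:
            continue  # the paper has no content for this tag
        parts.append(f'<section id="{key}">')
        parts.append(f"<h2>{label}</h2>")
        parts.append(f"<p>{escape(sections[key])}</p>")
        parts.append("</section>")
    parts += ["</body>", "</html>"]
    return "\n".join(parts)


page = render_static_page("A Study of Widgets", {"abstract": "Widgets & gadgets compared."})
print(page)
```

Because the content is written into the page as escaped text, the browser's own find and copy functions work on it directly.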
Furthermore, a structured document can be generated according to the structured information, and then a static page can be generated based on the structured document. At this time, tags may be set in the structured document, the tags being used to indicate the document structure to which each part in the document belongs. And converting the label in the structured document into a label which can be identified by the browser, thereby obtaining the static page. The structured document may be in a document format that conforms to a common standard.
Step 103, generating an access address corresponding to the static page, so that the user can browse the paper to be published according to the access address.
In actual application, the publishing equipment can number all the generated static pages, so that the access address can be generated according to the numbers of the static pages; in another embodiment, the publishing device also numbers the paper to be published, and at this time, the access address can be generated according to the number of the published paper; in addition, the name of the paper to be published can be obtained, and the access address corresponding to the static page can be generated according to the name of the paper.
The access address may further include a domain name of the website, and may further include a number of a web page, a number of a paper to be published, or a paper name. These contents may be added to the access address according to the requirement, and the present embodiment does not limit this.
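The address generation options can be sketched as follows. The `/papers/` path layout, the `.html` suffix, the `https` scheme, and the domain `example.org` are assumptions chosen for illustration only.

```python
from urllib.parse import quote, urlunsplit


def address_from_number(domain: str, number: int) -> str:
    """Build an access address from the number of a static page or of a paper."""
    return urlunsplit(("https", domain, f"/papers/{number}.html", "", ""))


def address_from_name(domain: str, paper_name: str) -> str:
    """Build an access address from the paper name, percent-encoding unsafe characters."""
    slug = quote(paper_name.strip().replace(" ", "-"), safe="-")
    return urlunsplit(("https", domain, f"/papers/{slug}.html", "", ""))


print(address_from_number("example.org", 42))
# https://example.org/papers/42.html
print(address_from_name("example.org", "A Study of Widgets"))
# https://example.org/papers/A-Study-of-Widgets.html
```

Numbers are unique by construction, whereas names may collide, so a name-based address would in practice need a uniqueness check that this sketch omits.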
Specifically, the access address can be published to the network, so that the user can browse the content of the paper to be published through the access address. After a user opens the page corresponding to the access address in a browser, the generated static page is displayed in the browser, and tags corresponding to the content types of the parts of the paper can be displayed for the user to use.
The method provided by this embodiment is used for online publication of papers and is executed by a device on which the method is deployed; the device is generally implemented in hardware and/or software.
The method for on-line publication of the paper provided by the embodiment comprises the following steps: acquiring a paper to be published, and extracting structural information included in the paper to be published according to a preset rule; generating a static page corresponding to the paper to be published according to the structured information; and generating an access address corresponding to the static page so that the user can browse the paper to be published according to the access address. The method provided by the embodiment can automatically process the paper to be published, so that the paper to be published is published without manual participation. In addition, in the method provided by the embodiment, the paper to be published is displayed in the form of a static page, and compared with the PDF format in the prior art, the interactivity with the user can be improved, so that the user experience is improved.
Fig. 2 is a flowchart illustrating a method for online publication of a paper according to another exemplary embodiment of the present invention.
As shown in FIG. 2, the method for online publication of a paper provided by this embodiment includes:
Step 201, acquiring a paper to be published.
The specific principle and implementation of step 201 are similar to those of acquiring the paper to be published in step 101, and are not described herein again.
Step 202, extracting the structured information included in the paper to be published according to a preset paper structure.
Wherein the structured information includes at least one of the following types:
title, author, abstract, text, chapter, diagram, table, footnote, reference.
In general, the structures of the papers to be published are similar, so that the general structure of the papers can be preset, and structured information can be extracted from the papers to be published based on the general structure. For example, where the first page of a paper includes title and author information, the publishing device may scan the first page of the paper to be published and extract the title and author information.
In addition, if the publishing device is used for publishing different types of papers, and the format of a paper depends on its type, a plurality of paper structures can be preset, and the structure corresponding to the type of the paper to be published is used for extracting the structured information. For example, the structure of a first paper type may include an abstract located on the second page of the paper, while the structure of a second paper type may not include an abstract. In this case, if the paper to be published is of the first type, the first paper structure is adopted, and if the paper to be published is of the second type, the second paper structure is adopted.
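The two paper structures in the example above can be sketched as a lookup table that records, for each paper type, the page on which each part is expected. The structure names, the part names, and the representation of a paper as a list of page texts are hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical preset paper structures: part name -> 0-based page index, or None if absent.
PAPER_STRUCTURES: Dict[str, Dict[str, Optional[int]]] = {
    "first_type": {"title_and_author": 0, "abstract": 1},   # abstract on the second page
    "second_type": {"title_and_author": 0, "abstract": None},  # no abstract in this type
}


def extract_by_structure(paper_type: str, pages: List[str]) -> Dict[str, str]:
    """Read each part from the page that the preset structure assigns to it."""
    structure = PAPER_STRUCTURES[paper_type]
    extracted: Dict[str, str] = {}
    for part, page_index in structure.items():
        if page_index is not None and page_index < len(pages):
            extracted[part] = pages[page_index]
    return extracted


pages = ["A Study of Widgets / Jane Doe", "This paper compares widgets.", "1 Introduction"]
print(extract_by_structure("first_type", pages))
print(extract_by_structure("second_type", pages))
```

The sketch returns whole pages; an actual extractor would go on to isolate the title, author, and abstract text within each page.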
If the structured information is of a content type that may be inserted in the middle of the text, such as a diagram or a table, the structured information may further include the location of that content. Specifically, the location of the diagram or table is expressed in terms of text segments, for example, between segments 101 and 102.
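Recording the location of a diagram or table relative to the text segments can be sketched as follows. The `InsertedItem` type and the 1-based segment numbering are assumptions; under them, an item located between segments 101 and 102 would carry `after_segment=101`.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class InsertedItem:
    kind: str           # "diagram" or "table"
    content: str
    after_segment: int  # the item sits between this segment and the next (1-based)


def interleave(segments: List[str], items: List[InsertedItem]) -> List[str]:
    """Return the text segments with each diagram or table restored to its position."""
    by_anchor: Dict[int, List[InsertedItem]] = {}
    for item in items:
        by_anchor.setdefault(item.after_segment, []).append(item)

    result: List[str] = []
    for number, segment in enumerate(segments, start=1):
        result.append(segment)
        for item in by_anchor.get(number, []):
            result.append(f"[{item.kind}: {item.content}]")
    return result


body = ["Segment one.", "Segment two.", "Segment three."]
figures = [InsertedItem("diagram", "Widget layout", after_segment=2)]
print(interleave(body, figures))
# ['Segment one.', 'Segment two.', '[diagram: Widget layout]', 'Segment three.']
```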
Step 203, generating a structured document according to the structured information.
Wherein, a structured document conforming to a common standard can be generated according to the extracted structured information.
FIG. 2A is a diagram illustrating a structured document in accordance with an exemplary embodiment of the present invention.
In particular, tags may be set in the structured document to represent different structured areas in the structured document.
Further, because the structured information is extracted in step 202 according to the structure of the paper, each piece of extracted structured information carries its type. For example, the first extracted part is a title, the second extracted part is an author, and the third extracted part is an abstract. The tag corresponding to a piece of structured information can be determined according to its type, and the area corresponding to that structured information in the structured document is marked with the tag.
In practice, the content corresponding to each tag can be obtained and assembled into the structured area for that tag, and the structured areas are then combined to form the structured document. The format of the structured document may be preset and specifically includes the relative positions of the respective structured areas, for example, the first part is a title area, the second part is an author area, and the like.
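Marking each area with a tag determined by its type, and arranging the areas in a preset order, can be sketched with an XML document. The tag names, the `paper` root element, and the area order are hypothetical and do not represent any particular document standard.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# Hypothetical mapping from structured-information type to the tag that marks its area.
TYPE_TO_TAG = {"title": "title", "author": "author", "abstract": "abstract", "text": "body"}
# Preset relative positions of the structured areas in the document.
AREA_ORDER = ["title", "author", "abstract", "text"]


def build_structured_document(items: List[Tuple[str, str]]) -> ET.Element:
    """Combine (type, content) pairs into one document; unknown types are skipped."""
    root = ET.Element("paper")
    for info_type in AREA_ORDER:
        for item_type, content in items:
            if item_type == info_type:
                area = ET.SubElement(root, TYPE_TO_TAG[info_type])
                area.text = content
    return root


doc = build_structured_document([("author", "Jane Doe"), ("title", "A Study")])
print(ET.tostring(doc, encoding="unicode"))
# <paper><title>A Study</title><author>Jane Doe</author></paper>
```

The title area comes first in the output even though the author was supplied first, because the area order is fixed by the preset format rather than by extraction order.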
Step 204, generating a static page corresponding to the paper to be published according to the structured document.
The tags in the structured document can be converted into a form of tag recognizable by the browser, thereby forming a static page.
Fig. 2B is a schematic diagram of a static page according to an exemplary embodiment of the present invention.
Specifically, in addition to the tags already present in the structured document, additional tags can be added to improve the user's browsing experience and ability to interact with the page. Buttons can be provided on the displayed page; for example, a "text" button can be provided which, when clicked, jumps directly to the text part of the paper.
Furthermore, because the paper to be published is displayed as a static page, the user can copy its content, and the content of the paper can be found directly when the user performs a search operation in the browser. Publishing the paper as a static page can therefore improve the interaction between the user and the paper.
In practice, software for converting the structured document into a static page can be provided in the publishing device, and the structured document is converted into the static page by means of this software.
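A button that jumps to a part of the paper can be sketched with ordinary in-page links, which a browser resolves to the element whose `id` matches the link fragment. The function name, the `nav-button` class, and the section identifiers are assumptions for illustration.

```python
from html import escape
from typing import Dict


def navigation_bar(section_labels: Dict[str, str]) -> str:
    """Return a <nav> element with one in-page link per section id."""
    links = [
        f'<a class="nav-button" href="#{escape(section_id)}">{escape(label)}</a>'
        for section_id, label in section_labels.items()
    ]
    return "<nav>" + "".join(links) + "</nav>"


print(navigation_bar({"abstract": "Abstract", "body": "Text"}))
# <nav><a class="nav-button" href="#abstract">Abstract</a><a class="nav-button" href="#body">Text</a></nav>
```

For the jump to work, the static page must contain elements carrying the same identifiers, for example an element with `id="body"` around the text part of the paper.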
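The conversion of structured-document tags into tags a browser recognizes can be sketched as follows, assuming the structured document is XML. The source tag names and the HTML elements they map to are hypothetical choices; tags without a mapping fall back to a generic `div`.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical mapping: structured-document tag -> HTML element recognized by browsers.
TAG_TO_HTML = {"title": "h1", "author": "address", "abstract": "blockquote", "body": "p"}


def structured_to_html(xml_text: str) -> str:
    """Convert each top-level area of the structured document into an HTML element."""
    root = ET.fromstring(xml_text)
    lines = ["<!DOCTYPE html>", "<html>", "<body>"]
    for area in root:
        html_tag = TAG_TO_HTML.get(area.tag, "div")
        content = escape(area.text or "")
        lines.append(f'<{html_tag} class="{escape(area.tag)}">{content}</{html_tag}>')
    lines += ["</body>", "</html>"]
    return "\n".join(lines)


html_page = structured_to_html("<paper><title>A &amp; B</title><note>x</note></paper>")
print(html_page)
```

Keeping the original tag name as the `class` attribute preserves the document structure in the page, so styles or scripts can still address each area by its type.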
Step 205, generating an access address corresponding to the static page, so that the user can browse the paper to be published according to the access address.
The specific principle and implementation of step 205 are similar to those of step 103, and are not described herein again.
The method for on-line publication of the paper provided by the embodiment comprises the following steps: acquiring a paper to be published, and extracting structural information included in the paper to be published according to a preset paper structure; generating a structured document according to the structured information, and converting the structured document into a static page; and generating an access address corresponding to the static page so that the user can browse the paper to be published according to the access address. The method provided by the embodiment can automatically process the paper to be published, so that the paper to be published is published without manual participation. In addition, in the method provided by the embodiment, the paper to be published is displayed in the form of a static page, and compared with the PDF format in the prior art, the interactivity with the user can be improved, so that the user experience is improved.
Fig. 3 is a block diagram illustrating an apparatus for online publication of a paper according to an exemplary embodiment of the present invention.
As shown in FIG. 3, the device for online publication of a paper provided in this embodiment includes:
the extraction module 31 is configured to acquire a paper to be published, and extract structural information included in the paper to be published according to a preset rule;
the webpage generating module 32 is configured to generate a static page corresponding to the paper to be published according to the structural information;
and the website generating module 33 is configured to generate an access address corresponding to the static page, so that the user can browse the paper to be published according to the access address.
The device for online publication of a paper provided by this embodiment comprises an extraction module, a webpage generating module, and a website generating module. The extraction module is used for acquiring the paper to be published and extracting structured information included in the paper to be published according to a preset rule; the webpage generating module is used for generating a static page corresponding to the paper to be published according to the structured information; and the website generating module is used for generating an access address corresponding to the static page so that the user can browse the paper to be published according to the access address. The device provided by this embodiment can automatically process papers to be published, so that they can be published without manual participation. In addition, in the device provided by this embodiment, the paper to be published is displayed in the form of a static page, which, compared with the PDF format in the prior art, improves interactivity with the user and thus the user experience.
The specific principle and implementation of the online publication device for papers provided in this embodiment are similar to those of the embodiment shown in fig. 1, and are not described herein again.
Fig. 4 is a block diagram illustrating an apparatus for online publication of a paper according to another exemplary embodiment of the present invention.
As shown in FIG. 4, on the basis of the foregoing embodiment, in the device for online publication of a paper provided in this embodiment, the extraction module 31 is specifically configured to:
extracting the structural information included in the paper to be published according to a preset paper structure;
the structured information includes at least one of the following types:
title, author, abstract, text, chapter, diagram, table, footnote, reference.
Optionally, the webpage generating module 32 includes:
a document generating unit 321 configured to generate a structured document according to the structured information;
a page generating unit 322, configured to generate the static page corresponding to the paper to be published according to the structured document.
Optionally, the document generating unit 321 is specifically configured to:
determining a label corresponding to the structural information according to the type corresponding to the structural information;
and marking the corresponding area of the structural information in the structural document according to the label.
Optionally, the website generating module 33 is specifically configured to:
and acquiring the number of the paper to be published, and generating an access address corresponding to the static page according to the number.
The device for online publication of a paper provided by this embodiment is configured to: acquire a paper to be published, and extract structured information included in the paper to be published according to a preset paper structure; generate a structured document according to the structured information, and convert the structured document into a static page; and generate an access address corresponding to the static page so that the user can browse the paper to be published according to the access address. The device provided by this embodiment can automatically process papers to be published, so that they can be published without manual participation. In addition, in the device provided by this embodiment, the paper to be published is displayed in the form of a static page, which, compared with the PDF format in the prior art, improves interactivity with the user and thus the user experience.
The specific principle and implementation of the online publication device for papers provided in this embodiment are similar to those of the embodiment shown in fig. 2, and are not described herein again.
Fig. 5 is a block diagram illustrating an apparatus for online publication of a paper according to an exemplary embodiment of the present invention.
As shown in fig. 5, the online publication device for papers provided in this embodiment includes:
a memory 51;
a processor 52; and
a computer program;
wherein said computer program is stored in said memory 51 and configured to be executed by said processor 52 to implement any of the methods of online publication of papers as described above.
The present embodiment also provides a computer-readable storage medium having stored thereon a computer program, wherein the computer program is executed by a processor to implement any one of the methods for online publication of a paper described above.
Those of ordinary skill in the art will understand that all or a portion of the steps of the above method embodiments may be implemented by program instructions directing the relevant hardware. The program may be stored in a computer-readable storage medium. When executed, the program performs the steps of the method embodiments described above. The aforementioned storage medium includes various media that can store program code, such as ROM, RAM, magnetic disks, or optical disks.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.