CN110929474B - Display method, electronic equipment and medium for literary composition chapters - Google Patents

Display method, electronic equipment and medium for literary composition chapters

Publication number
CN110929474B
Authority
CN
China
Prior art keywords
garbage
chapter
nodes
node
chapters
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911030123.9A
Other languages
Chinese (zh)
Other versions
CN110929474A (en)
Inventor
朱文进
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Vivo Mobile Communication Hangzhou Co Ltd
Original Assignee
Vivo Mobile Communication Hangzhou Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Vivo Mobile Communication Hangzhou Co Ltd
Priority to CN201911030123.9A
Publication of CN110929474A
Application granted
Publication of CN110929474B
Legal status: Active
Anticipated expiration

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

An embodiment of the invention discloses a display method, an electronic device, and a medium for literary work chapters. The display method comprises the following steps: parsing a HyperText Markup Language (HTML) document of a first literary work into a first Document Object Model (DOM) tree, the first DOM tree comprising N nodes of N chapters of the first literary work; identifying M target nodes among the N nodes, wherein the target nodes indicate garbage chapters among the N chapters; deleting the M target nodes from the first DOM tree to obtain a second DOM tree; and displaying T chapters of the first literary work based on the second DOM tree; wherein N, M and T are all positive integers, and N = M + T. With the embodiment of the invention, garbage chapters can be filtered out, which prevents them from affecting the user's reading experience.

Description

Display method, electronic equipment and medium for literary composition chapters
Technical Field
The embodiment of the invention relates to the field of electronic equipment, in particular to a display method of literary work chapters, electronic equipment and a medium.
Background
With the rapid development of networks, literary works (such as novels and comics) that use the network as a carrier have developed rapidly. Authors can publish literary works on the network and update them continuously so that readers can read them over the network. Online novels are characterized by a free style, unrestricted form, and simple ways of publishing and reading.
Some literary works contain garbage chapters. For example, the content of a garbage chapter includes advertisements posted by the author, new book recommendations, and the like, and this content is irrelevant to the content of the literary work itself. Such garbage chapters seriously interfere with the reader's reading of the literary work.
Disclosure of Invention
The embodiment of the invention provides a display method of literary work chapters, which aims to solve the problem that the reading of a user is influenced by garbage chapters in literary works.
In order to solve the technical problems, the invention is realized as follows:
In a first aspect, an embodiment of the present invention provides a method for displaying literary work chapters, including:
parsing a HyperText Markup Language (HTML) document of a first literary work into a first Document Object Model (DOM) tree, the first DOM tree comprising N nodes of N chapters of the first literary work;
identifying M target nodes among the N nodes, wherein the target nodes indicate garbage chapters among the N chapters;
deleting the M target nodes from the first DOM tree to obtain a second DOM tree;
displaying T chapters of the first literary work based on the second DOM tree;
wherein N, M and T are all positive integers, and N = M + T.
In a second aspect, an embodiment of the present invention provides an electronic device, including:
a document parsing module, configured to parse a HyperText Markup Language (HTML) document of a first literary work into a first Document Object Model (DOM) tree, wherein the first DOM tree comprises N nodes of N chapters of the first literary work;
a target node identification module, configured to identify M target nodes among the N nodes, wherein the target nodes indicate garbage chapters among the N chapters;
a node deleting module, configured to delete the M target nodes from the first DOM tree to obtain a second DOM tree;
a chapter display module, configured to display T chapters of the first literary work based on the second DOM tree;
wherein N, M and T are all positive integers, and N = M + T.
In a third aspect, an embodiment of the present invention provides an electronic device, including a processor, a memory, and a computer program stored in the memory and executable on the processor, where the computer program when executed by the processor implements the steps of the method for displaying a literary composition chapter.
In a fourth aspect, embodiments of the present invention provide a computer readable storage medium having a computer program stored thereon, the computer program when executed by a processor implementing the steps of the method for displaying a section of a literary work.
In the embodiment of the invention, the garbage chapter is filtered by identifying the garbage chapter and deleting the target node indicating the garbage chapter from the DOM tree of the literary work, so that the garbage chapter is not included in the chapter displayed to the user, the garbage chapter is prevented from affecting the reading experience of the user, and the effect of purifying the literary work chapter is achieved.
Drawings
FIG. 1 is a flow diagram of a method of displaying a section of a literary composition according to one embodiment of the invention;
FIG. 2 illustrates a display interface schematic of an unfiltered garbage section of an embodiment of the present invention;
FIG. 3 shows a code schematic of a DOM tree of one embodiment of the invention;
FIG. 4 shows a schematic diagram of the structure of a DOM tree of one embodiment of the invention;
FIG. 5 shows a schematic diagram of the structure of a CSSOM in accordance with an embodiment of the invention;
FIG. 6 illustrates a schematic diagram of the structure of a rendering tree of one embodiment of the present invention;
FIG. 7 illustrates a display interface schematic of a filtered trash chapter in accordance with one embodiment of the present invention;
FIG. 8 illustrates an interface diagram of whether to turn on the garbage chapter filtering mode according to an embodiment of the present invention;
FIG. 9 illustrates a display interface schematic of an unfiltered trash section of another embodiment of the invention;
FIG. 10 illustrates a display interface diagram of a filtered trash chapter in accordance with another embodiment of the present invention;
FIG. 11 shows a block diagram of an electronic device of an embodiment of the invention;
FIG. 12 is a schematic diagram of the hardware structure of an electronic device implementing an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Evidently, the described embodiments are some, but not all, of the embodiments of the invention. All other embodiments obtained by those skilled in the art based on the embodiments of the invention without inventive effort fall within the scope of protection of the invention.
Fig. 1 is a flow chart of a method for displaying a section of a literary work according to an embodiment of the invention. As shown in fig. 1, the method for displaying the literary composition chapter includes:
Step 102, parsing the HyperText Markup Language (HTML) document of the first literary work into a first Document Object Model (DOM) tree, the first DOM tree comprising N nodes of N chapters of the first literary work.
The first literary work may be a work of various forms, such as a novel, a comic, prose, and the like. The novel may be a novel on a novel site on the internet, i.e., an online novel. HTML (HyperText Markup Language) is an application of the Standard Generalized Markup Language. HTML is not a programming language but a markup language, and it is essential for web page production. "Hypertext" means that a page may contain pictures, links, and even non-text elements such as music and programs. The structure of an HTML document includes a "head" portion that provides information about the web page and a "body" portion that provides the specific content of the web page.
DOM (Document Object Model) tree is a collection of DOM nodes, which can be viewed as a tree structure, which is referred to as a node tree. All nodes can be accessed through this tree, their content can be modified or deleted, and new elements can be created.
For example, when an HTML document is received, an interface as shown in FIG. 2 is displayed according to the HTML document, and a garbage chapter appears among the latest chapters on the interface. The HTML document may be parsed into a DOM tree as shown in FIG. 3. The DOM tree of FIG. 3 contains various nodes, including html, body, head, and div nodes, among which are the nodes of the chapters shown in FIG. 2. FIG. 4 shows a schematic structure of a DOM tree. As can be seen from FIG. 4, the DOM tree includes: a hypertext markup language (html) element, a body element, a head element, a meta element, a link element, a paragraph (p) element, a document division (div) element, a generic inline container (span) element, an image (img) element, a first text (hello), a second text (Web performance), and a third text (students). The meta element represents metadata that cannot be represented by other HTML metadata-related elements, such as the link element. The span element is a generic inline container for phrasing content and does not have any special semantics.
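The parsing in step 102 can be illustrated with a minimal sketch. It assumes a small, well-formed page and uses Python's standard `xml.etree.ElementTree` as a stand-in for a browser's HTML parser; the sample markup, the `class="chapter"` convention, and the function names are hypothetical and are not taken from the figures.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed chapter-list page; real pages may need a lenient HTML parser.
HTML_DOC = """<html>
  <head><title>Sample novel</title></head>
  <body>
    <div id="chapters">
      <p class="chapter">Chapter 1 The Beginning</p>
      <p class="chapter">Zhang San new book preview, stay tuned!</p>
      <p class="chapter">Chapter 2 The Journey</p>
    </div>
  </body>
</html>"""


def parse_dom(html_text):
    """Parse well-formed markup into an element tree (playing the role of the first DOM tree)."""
    return ET.fromstring(html_text)


def chapter_nodes(root):
    """Return the nodes that carry chapter titles (assumed to be <p class="chapter">)."""
    return [node for node in root.iter("p") if node.get("class") == "chapter"]


first_dom = parse_dom(HTML_DOC)
titles = [node.text for node in chapter_nodes(first_dom)]
print(titles)
```

Here N is the number of nodes returned by `chapter_nodes`, one per chapter.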
Step 104, identifying M target nodes in the N nodes, wherein the target nodes indicate garbage chapters in the N chapters.
A garbage chapter is a useless chapter in a literary work whose content is irrelevant to the content of the literary work itself. For example, garbage chapters include author advertisements, new book recommendations, new book remarks, novel advertisements, and the like.
Step 106, deleting the M target nodes from the first DOM tree to obtain a second DOM tree.
For example, with continued reference to FIG. 2, the entry "Zhang San new book preview, stay tuned" among the latest chapters in FIG. 2 is a garbage chapter. For the DOM tree in FIG. 3, the node of this garbage chapter is deleted from the first DOM tree, resulting in the second DOM tree.
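The deletion in step 106 might be sketched as follows, assuming chapter titles sit in `p` elements of a well-formed tree. The predicate `looks_like_garbage` is only a hypothetical placeholder for the identification of step 104, and the sample markup is illustrative.

```python
import copy
import xml.etree.ElementTree as ET


def delete_target_nodes(first_dom, is_target):
    """Return a second tree: a copy of first_dom without the <p> nodes flagged by is_target."""
    second_dom = copy.deepcopy(first_dom)  # leave the first tree untouched
    # ElementTree nodes do not know their parent, so build a child -> parent map first.
    parent_of = {child: parent for parent in second_dom.iter() for child in parent}
    for node in list(second_dom.iter("p")):
        if is_target(node):
            parent_of[node].remove(node)
    return second_dom


def looks_like_garbage(node):
    """Hypothetical placeholder for step 104: flag titles containing a known garbage phrase."""
    return "new book preview" in (node.text or "")


first_dom = ET.fromstring(
    "<div>"
    "<p>Chapter 1 The Beginning</p>"
    "<p>Zhang San new book preview, stay tuned!</p>"
    "<p>Chapter 2 The Journey</p>"
    "</div>"
)
second_dom = delete_target_nodes(first_dom, looks_like_garbage)

n = len(list(first_dom.iter("p")))   # N chapters in the first tree
t = len(list(second_dom.iter("p")))  # T chapters left in the second tree
m = n - t                            # M deleted target nodes
print(n, m, t)
```

Copying the tree before deleting keeps the first tree available, which is convenient if the unfiltered chapter list must still be displayed.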
Step 108, displaying T chapters of the first literary work based on the second DOM tree; wherein N, M and T are all positive integers, and N = M + T.
Wherein the second DOM Tree is combined with the cascading style sheet object model (Cascading Style Sheets Object Model, CSSOM) to generate a rendering Tree (Render Tree).
CSSOM and rendering tree are described below.
The CSSOM shown in FIG. 5 includes: a body element, a paragraph (p) element, a generic inline container (span) element, an image (img) element, a fourth text (Font-size: 16 px), a fifth text (Font-size: 16px; font-weight: bold), a sixth text (Font-size: 16px; display: none), a seventh text (Font-size: 16px; color: red), and an eighth text (Font-size: 16px; float: right).
Combining the DOM tree shown in FIG. 4 with the CSSOM shown in FIG. 5, a rendering tree shown in FIG. 6 is generated. FIG. 6 illustrates that the rendering tree includes: a body element, a paragraph (P) element, a document partition (div) element, an image (img) element, a fourth text (Font-size: 16 px), a fifth text (Font-size: 16px; font-weight: bold), a first text (hello), a third text (students), and an eighth text (Font-size: 16px; float: right).
In addition, after the second DOM tree and the CSSOM are combined to generate the rendering tree, layout is performed according to the generated rendering tree to obtain the geometric information (including position and size) of each node. The absolute pixels of each node are then obtained from the rendering tree and the geometric information produced by the layout, and are sent to a graphics processing unit (GPU) to be displayed on the page. The final display effect is shown in FIG. 7, in which the latest chapter part does not include the garbage chapter.
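As a rough illustration of how a DOM tree and a CSSOM combine into a rendering tree, the sketch below drops any node whose computed style is `display: none` and passes a few inheritable properties down to children. It is a simplification under stated assumptions: styles are keyed by tag name only, the set of inherited properties is a small hypothetical subset, and layout and painting are not modelled.

```python
import xml.etree.ElementTree as ET

# Assumed subset of CSS properties that children inherit from their parent.
INHERITED = {"font-size", "font-weight", "color"}


def build_render_tree(node, cssom, parent_style=None):
    """Return a nested dict for a visible node, or None if the node is not rendered."""
    style = {k: v for k, v in (parent_style or {}).items() if k in INHERITED}
    style.update(cssom.get(node.tag, {}))
    if style.get("display") == "none":
        return None  # the node and its whole subtree are left out of the rendering tree
    children = [build_render_tree(child, cssom, style) for child in node]
    return {
        "tag": node.tag,
        "text": (node.text or "").strip(),
        "style": style,
        "children": [child for child in children if child is not None],
    }


# Hypothetical style rules keyed by tag name, loosely modelled on the values in FIG. 5.
CSSOM = {
    "body": {"font-size": "16px"},
    "p": {"font-weight": "bold"},
    "span": {"display": "none"},
    "img": {"float": "right"},
}
DOM = ET.fromstring(
    "<body><p>Hello</p><span>Web performance</span><div><img/></div></body>"
)

render_tree = build_render_tree(DOM, CSSOM)
print([child["tag"] for child in render_tree["children"]])  # ['p', 'div']
```

Because the garbage chapter nodes are already absent from the second DOM tree, they never reach this stage and so cannot appear on the displayed page.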
In the embodiment of the invention, the garbage chapter is filtered by identifying the garbage chapter and deleting the node of the garbage chapter from the DOM tree of the literature work, so that the garbage chapter is not included in the chapter displayed to the user, the garbage chapter is prevented from affecting the reading experience of the user, and the effect of purifying the literature work chapter is achieved.
In one embodiment of the present invention, step 104 includes:
the following steps are performed for each of the N nodes: acquiring at least one keyword (the keyword may include a vocabulary or a phrase formed by a plurality of vocabularies) in a chapter title of a node (such as the bolded text in fig. 3 is the chapter title); calculating the occurrence frequency of each keyword in a pre-established garbage vocabulary library; and identifying whether the node is a target node according to the frequency of the at least one keyword.
The garbage vocabulary library may include repeated vocabulary, and the more frequently the keywords in the chapter title of the node appear in the garbage vocabulary library, the more likely the chapter indicated by the node is a garbage chapter. If a plurality of keywords exist in the chapter titles of the nodes, calculating the frequency of each keyword in the garbage vocabulary library, calculating the average frequency of all the keywords in the chapter titles in the garbage vocabulary library, and identifying whether the nodes are target nodes for indicating garbage chapters according to the average frequency.
By calculating the occurrence frequency of keywords in chapter titles of nodes in the garbage vocabulary library, whether the nodes are target nodes for indicating garbage chapters can be accurately identified, and therefore accurate filtering of the garbage chapters in literary works is achieved.
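A minimal sketch of the frequency calculation, assuming the garbage vocabulary library is a list that may contain repeated entries and that frequency means the share of library entries equal to the keyword. How keywords are extracted from a title is not specified above, so they are passed in directly; the sample library is hypothetical.

```python
from collections import Counter


def keyword_frequency(keyword, library):
    """Share of library entries that equal the keyword (0.0 for an empty library)."""
    if not library:
        return 0.0
    return Counter(library)[keyword] / len(library)


def average_frequency(keywords, library):
    """Average frequency over all keywords of one chapter title."""
    if not keywords:
        return 0.0
    return sum(keyword_frequency(k, library) for k in keywords) / len(keywords)


# Hypothetical library with a repeated entry.
LIBRARY = ["new book preview", "stay tuned", "new book preview", "advertisement"]

print(average_frequency(["new book preview", "stay tuned"], LIBRARY))  # 0.375
print(average_frequency(["the beginning"], LIBRARY))                   # 0.0
```

A node could then be treated as a target node when its average frequency exceeds a threshold chosen for the library at hand; that threshold is a tuning assumption rather than something the text fixes.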
In one embodiment of the present invention, step 104 includes:
the following steps are performed for each of the N nodes: acquiring at least one keyword in a chapter title of a node; calculating the similarity between each keyword and words in a pre-established garbage vocabulary library; and identifying whether the node is a target node according to the similarity of at least one keyword.
For example, calculating the similarity between each keyword and the words in the pre-established garbage vocabulary library includes: calculating the distance between the keyword and the words in the garbage vocabulary library, and using the distance as the measure of similarity. The higher the similarity between the keywords in the chapter title of a node and the words in the garbage vocabulary library, the more likely the chapter indicated by the node is a garbage chapter. If the chapter title contains a plurality of keywords, the similarity between each keyword and the words in the garbage vocabulary library is calculated, the average similarity of the keywords in the chapter title of the node is then calculated, and whether the chapter indicated by the node is a garbage chapter is identified according to the average similarity.
By calculating the similarity between the keywords in the chapter titles of the nodes and the vocabulary in the garbage vocabulary library, whether the nodes are target nodes for indicating garbage chapters can be accurately identified, so that the garbage chapters in literary works can be accurately filtered.
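The text does not say which distance is used. The sketch below assumes Levenshtein edit distance and converts it to a similarity in [0, 1] by normalising with the longer string's length; taking the best match over the library as a keyword's similarity is likewise an assumption.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def similarity(a, b):
    """Map edit distance to [0, 1]; 1.0 means the strings are identical."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def best_similarity(keyword, library):
    """Similarity of a keyword to its closest word in the library (0.0 if empty)."""
    return max((similarity(keyword, word) for word in library), default=0.0)


def average_similarity(keywords, library):
    """Average best similarity over all keywords of one chapter title."""
    if not keywords:
        return 0.0
    return sum(best_similarity(k, library) for k in keywords) / len(keywords)


LIBRARY = ["new book preview", "stay tuned"]  # hypothetical library
print(best_similarity("stay tuned", LIBRARY))  # 1.0
```

Normalising the distance keeps the stated relationship intact: a smaller distance yields a higher similarity, and a higher similarity makes a garbage chapter more likely.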
In one embodiment of the present invention, step 104 includes:
the following steps are performed for each of the N nodes: acquiring at least one keyword in a chapter title of a node; calculating the occurrence frequency of each keyword in a pre-established garbage vocabulary library and the similarity of each keyword and words in the garbage vocabulary library; and identifying whether the node is a target node according to the frequency and the similarity.
The frequency of occurrence of the keywords in the chapter titles of the nodes in the garbage vocabulary library and the similarity between the keywords in the chapter titles and the vocabulary in the garbage vocabulary library are calculated, and whether the node is a target node or not is identified by combining the calculated frequency and similarity, so that the accuracy of the identification result is further ensured.
In one embodiment of the present invention, identifying whether a node is a target node based on frequency and similarity includes:
calculating a target indication value according to the frequency and the similarity of the node, wherein the target indication value is used for measuring whether the chapter indicated by the node is a garbage chapter; and determining that the node is a target node in the case where the target indication value is greater than a predetermined threshold.
By calculating a target indication value for measuring whether the chapter indicated by the node is a garbage chapter, if the target indication value is larger than a preset threshold value, the chapter indicated by the node is the garbage chapter, namely the node is the target node which needs to be deleted from the DOM tree of the first document object model, so that the target node for indicating the garbage chapter is accurately found.
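How the frequency and the similarity are combined into the target indication value is not specified. One possible sketch, assuming a weighted sum with hypothetical weights and a hypothetical threshold of 0.5:

```python
def target_indication_value(frequency, similarity, w_freq=0.5, w_sim=0.5):
    """Weighted sum of frequency and similarity (the weights are assumptions)."""
    return w_freq * frequency + w_sim * similarity


def is_target_node(frequency, similarity, threshold=0.5):
    """A node is a target node only if its value is strictly greater than the threshold."""
    return target_indication_value(frequency, similarity) > threshold


print(is_target_node(0.4, 0.9))  # True  (0.65 > 0.5)
print(is_target_node(0.0, 0.2))  # False (0.10 <= 0.5)
```

The strict comparison mirrors the wording "greater than the predetermined threshold"; a value exactly equal to the threshold does not mark the node for deletion.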
In one embodiment of the present invention, before step 104, the method for displaying the literature chapters further includes: constructing a garbage vocabulary library, wherein the garbage vocabulary library is constructed by adopting at least one of the following modes:
Mode one
Crawling a second literary work, and updating the garbage vocabulary library according to the chapter title of the second literary work in the case where the chapter title of the second literary work does not include a chapter feature word. Updating the garbage vocabulary library comprises: adding the keywords in the chapter title to the garbage vocabulary library.
For example, a crawler crawls a novel site to obtain a second literary work. Assume that a chapter title of the second literary work is "Zhang San new book preview, stay tuned!". The chapter title does not include a chapter feature word such as "Chapter X", where "X" may be any text, and thus the phrases "new book preview" and "stay tuned" in the chapter title may be added to the garbage vocabulary library.
The garbage vocabulary library is updated according to the crawled literary works by crawling the literary works on the network, and the number of the literary works on the network is numerous, so that the garbage vocabulary in the updated garbage vocabulary library is also richer.
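Mode one can be sketched as below. The regular expression for the chapter feature word and the punctuation-based phrase splitting are assumptions made for illustration; the crawler itself is omitted and the crawled titles are supplied as a list.

```python
import re
from collections import Counter

# Assumed English analogue of a chapter feature word: "Chapter" followed by any token.
CHAPTER_FEATURE = re.compile(r"\bchapter\s+\S+", re.IGNORECASE)
# Assumed phrase boundaries: common punctuation marks.
PHRASE_SPLIT = re.compile(r"[,.!?;:\-]+")


def phrases(title):
    """Split a title into lower-cased phrases at punctuation marks."""
    return [part.strip().lower() for part in PHRASE_SPLIT.split(title) if part.strip()]


def update_library_from_titles(library, titles):
    """Add phrases of titles that lack a chapter feature word to the library."""
    for title in titles:
        if not CHAPTER_FEATURE.search(title):
            library.update(phrases(title))
    return library


library = Counter()  # a Counter keeps repeated entries, so frequencies remain meaningful
crawled_titles = [
    "Chapter 12 The Duel",
    "Zhang San new book preview, stay tuned!",
]
update_library_from_titles(library, crawled_titles)
print(sorted(library))
```

Splitting on punctuation keeps the author's name attached to the first phrase; a real keyword extractor would presumably isolate "new book preview" on its own, as in the example above.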
Mode two
Receiving garbage chapter information sent by at least one electronic device, and updating the garbage vocabulary library according to the garbage chapter information. Updating the garbage vocabulary library comprises: adding the garbage words or phrases in the garbage chapter information to the garbage vocabulary library.
For example, the garbage chapter information sent by the electronic device includes "Update notice - Chapter X will be updated at 15:00, stay tuned", and the phrases "update notice", "update Chapter X", and "stay tuned" are added to the garbage vocabulary library, where "X" in "Chapter X" may be any text.
The garbage vocabulary library is updated through the garbage chapter information reported by the user, and because the garbage chapter reported by the user is usually the garbage chapter in practice, the problem that vocabularies which are not garbage chapters are added into the garbage vocabulary library is avoided, so that the constructed garbage vocabulary library is more accurate.
In one embodiment of the present invention, before identifying the garbage chapter corresponding to the node on the DOM tree, the method for displaying the literature chapter further includes:
judging whether the garbage chapter filtering mode is turned on.
Wherein step 104 comprises:
in the case where the garbage chapter filtering mode has been turned on, M target nodes among the N nodes are identified.
According to the embodiment of the invention, the user can start the garbage chapter filtering mode according to the actual requirements of the user, so that the requirements of reading literary works of different users are met.
In one embodiment of the present invention, after determining whether to start the garbage chapter filtering mode, the method for displaying a literary composition chapter further includes:
displaying the N chapters of the first literary work according to the first DOM tree in the case where the garbage chapter filtering mode is not turned on.
According to the embodiment of the invention, the user can close the garbage chapter filtering mode according to the actual requirement of the user so as to display the chapter including the garbage chapter of the first literary work, thereby meeting the requirements of different users for reading literary works.
For example, as shown in FIG. 8, if it is detected that the garbage chapter filtering button is triggered, an interface asking whether to turn on the garbage chapter filtering mode is displayed, on which the user can choose to turn the mode on or off according to his or her own needs.
For another example, in the case that the garbage chapter filtering mode is not started, an interface as shown in fig. 2 is displayed, and the user clicks on the chapter catalog on the interface, that is, receives an input of viewing the chapter catalog, and in response to the input, the chapter catalog as shown in fig. 9 is displayed, where the chapter catalog includes garbage chapters. After the user sees the garbage chapter, the garbage chapter filtering mode may be started to filter out the garbage chapter in the chapter directory shown in fig. 9, and the chapter directory shown in fig. 10 is displayed.
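The effect of the filtering mode on what is displayed can be sketched as a single branch. The `is_garbage` predicate is a hypothetical stand-in for the target node identification, and chapter titles stand in for DOM nodes.

```python
def chapters_to_display(titles, is_garbage, filter_enabled):
    """Return all N chapters when filtering is off, or only the T kept chapters when it is on."""
    if not filter_enabled:
        return list(titles)
    return [title for title in titles if not is_garbage(title)]


def is_garbage(title):
    """Hypothetical predicate: flag titles that contain a known garbage phrase."""
    return "stay tuned" in title.lower()


catalog = [
    "Chapter 1 The Beginning",
    "Zhang San new book preview, stay tuned!",
    "Chapter 2 The Journey",
]
print(chapters_to_display(catalog, is_garbage, filter_enabled=False))  # full catalog
print(chapters_to_display(catalog, is_garbage, filter_enabled=True))   # garbage removed
```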
In one embodiment of the present invention, before step 102, the method for displaying a literary composition chapter further includes:
receiving a first input of a user to a first literary work; in response to the first input, sending a request message to the server to obtain an HTML document for the first literary work; and receiving and storing the HTML document of the first literary work sent by the server according to the request message.
The first input may be a click input, a long-press input, a slide input, or the like. For example, an input for reading a novel is received on a browser interface, and a request message for acquiring the HTML document of the novel is sent to a server of the novel site; a response sent by the server of the novel site is then received, the response encapsulating the HTML document of the novel.
According to the embodiment of the invention, under the condition that a user needs to read the literary works, the server can be requested to send the HTML document of the literary works needing to be read. And the HTML document fed back by the server is utilized to read the literary works, so that the requirement of a user for reading the literary works is met.
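The request, receive, and store sequence might look like the sketch below. The URL is hypothetical, the in-memory dictionary stands in for whatever storage the device uses, and the `opener` parameter exists only so the example runs without network access; by default it is the standard `urllib.request.urlopen`.

```python
import io
import urllib.request


def fetch_html(url, cache, opener=urllib.request.urlopen):
    """Request the HTML document at url, store it in cache, and return it as text."""
    with opener(url) as response:
        html_text = response.read().decode("utf-8")  # UTF-8 is assumed here
    cache[url] = html_text
    return html_text


def fake_opener(url):
    """Offline stand-in for the server response, used only in this example."""
    return io.BytesIO(b"<html><body><p>Chapter 1</p></body></html>")


cache = {}
document = fetch_html("https://novels.example/book/1", cache, opener=fake_opener)
print(document)
```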
From the above embodiment, it can be seen that by parsing the HTML document into a first DOM tree, identifying a target node in the first DOM tree indicating a spam section using a spam vocabulary library, and deleting the target node from the first DOM tree, a second DOM tree is obtained, i.e. the second DOM tree does not include the target node indicating a spam section. Therefore, the chapter displayed based on the second DOM tree does not comprise a garbage chapter, so that the influence of the garbage chapter on the reading experience of a user is avoided, the effect of purifying the literature chapter is achieved, a purified page effect is provided for the reading user of the literature, and a good atmosphere for reading the literature can be created.
FIG. 11 shows a block diagram of an electronic device of one embodiment of the invention. As shown in fig. 11, the electronic apparatus 200 includes:
the document parsing module 202 is configured to parse the hypertext markup language HTML document of the first literature into a first document object model DOM tree, where the first document object model DOM tree includes N nodes of N chapters of the first literature.
The target node identifying module 204 is configured to identify M target nodes among the N nodes, where the target nodes indicate garbage chapters among the N chapters.
The node deleting module 206 is configured to delete the M target nodes from the DOM tree of the first document object model, to obtain a second DOM tree.
A first chapter display module 208, configured to display T chapters of the first literary work based on the second DOM tree; wherein N, M and T are all positive integers, and N = M + T.
In the embodiment of the invention, the garbage chapter is filtered by identifying the garbage chapter and deleting the node of the garbage chapter from the DOM tree of the literature work, so that the garbage chapter is not included in the chapter displayed to the user, the garbage chapter is prevented from affecting the reading experience of the user, and the effect of purifying the literature work chapter is achieved.
In one embodiment of the invention, the target node identification module 204 includes:
a first keyword acquisition module, configured to acquire at least one keyword in the chapter title of the node;
a frequency calculation module, configured to calculate the frequency of occurrence of each keyword in the pre-established garbage vocabulary library; and
a first identification module, configured to identify whether the node is a target node according to the frequency of the at least one keyword.
In one embodiment of the invention, the target node identification module 204 includes:
a second keyword acquisition module, configured to acquire at least one keyword in the chapter title of the node;
a similarity calculation module, configured to calculate the similarity between each keyword and the words in the pre-established garbage vocabulary library; and
a second identification module, configured to identify whether the node is a target node according to the similarity of the at least one keyword.
In one embodiment of the invention, the target node identification module 204 includes:
a third keyword acquisition module, configured to acquire at least one keyword in the chapter title of the node;
a parameter calculation module, configured to calculate the frequency of occurrence of each keyword in the pre-established garbage vocabulary library and the similarity between each keyword and the words in the garbage vocabulary library; and
a third identification module, configured to identify whether the node is a target node according to the frequency and the similarity.
In one embodiment of the invention, the third identification module comprises:
a target value calculation module, configured to calculate a target indication value according to the frequency and the similarity of the node, wherein the target indication value is used for measuring whether the chapter indicated by the node is a garbage chapter; and
a node determining module, configured to determine that the node is a target node in the case where the target indication value is greater than a predetermined threshold.
In one embodiment of the invention, the electronic device 200 further comprises:
a crawling module, configured to crawl a second literary work; and
a first vocabulary library updating module, configured to update the garbage vocabulary library according to the chapter title of the second literary work in the case where the chapter title of the second literary work does not include a chapter feature word.
In one embodiment of the invention, the electronic device 200 further comprises:
a garbage information receiving module, configured to receive garbage chapter information sent by at least one electronic device; and
a second vocabulary library updating module, configured to update the garbage vocabulary library according to the garbage chapter information.
In one embodiment of the invention, the electronic device 200 further comprises:
a mode judging module, configured to judge whether the garbage chapter filtering mode is turned on.
The target node identification module 204 is configured to identify the M target nodes among the N nodes in the case where the garbage chapter filtering mode has been turned on.
In one embodiment of the invention, the electronic device 200 further comprises:
a second chapter display module, configured to display the N chapters of the first literary work according to the first DOM tree in the case where the garbage chapter filtering mode is not turned on.
In one embodiment of the invention, the electronic device 200 further comprises:
a first input receiving module, configured to receive a first input by a user on the first literary work;
a first input response module, configured to send, in response to the first input, a request message for acquiring the HTML document of the first literary work to the server;
an HTML document receiving module, configured to receive the HTML document of the first literary work sent by the server according to the request message; and
an HTML document storage module, configured to store the HTML document of the first literary work.
From the above embodiment, it can be seen that by parsing the HTML document into a first DOM tree, identifying a target node in the first DOM tree indicating a spam section using a spam vocabulary library, and deleting the target node from the first DOM tree, a second DOM tree is obtained, i.e. the second DOM tree does not include the target node indicating a spam section. Therefore, the chapter displayed based on the second DOM tree does not comprise a garbage chapter, so that the influence of the garbage chapter on the reading experience of a user is avoided, the effect of purifying the literature chapter is achieved, a purified page effect is provided for the reading user of the literature, and a good atmosphere for reading the literature can be created.
Fig. 12 is a schematic hardware structure of an electronic device implementing an embodiment of the present invention, where the electronic device 300 includes, but is not limited to: radio frequency unit 301, network module 302, audio output unit 303, input unit 304, sensor 305, display unit 306, user input unit 307, interface unit 308, memory 309, processor 310, and power supply 311. Those skilled in the art will appreciate that the electronic device structure shown in fig. 12 is not limiting of the electronic device and that the electronic device may include more or fewer components than shown, or may combine certain components, or a different arrangement of components. In the embodiment of the invention, the electronic equipment comprises, but is not limited to, a mobile phone, a tablet computer, a notebook computer, a palm computer, a vehicle-mounted terminal, a wearable device, a pedometer and the like.
The processor 310 is configured to: parse the hypertext markup language (HTML) document of the first literary work into a first document object model (DOM) tree, the first DOM tree including N nodes of N chapters of the first literary work; identify M target nodes among the N nodes, where the target nodes indicate garbage chapters among the N chapters; delete the M target nodes from the first DOM tree to obtain a second DOM tree; and display T chapters of the first literary work based on the second DOM tree; where N, M and T are all positive integers and N = M + T.
In the embodiment of the invention, the garbage chapter is filtered by identifying the garbage chapter and deleting the node corresponding to the garbage chapter from the DOM tree of the literary work, so that the garbage chapter is not included in the chapter displayed to the user, the garbage chapter is prevented from affecting the reading experience of the user, and the effect of purifying the literary work chapter is achieved.
It should be understood that, in the embodiment of the present invention, the radio frequency unit 301 may be used to receive and send signals while sending and receiving information or during a call. Specifically, it receives downlink data from a base station and passes the data to the processor 310 for processing; in addition, it sends uplink data to the base station. Typically, the radio frequency unit 301 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier, a duplexer, and the like. In addition, the radio frequency unit 301 may also communicate with networks and other devices through a wireless communication system.
The electronic device provides wireless broadband internet access to the user through the network module 302, such as helping the user to send and receive e-mail, browse web pages, and access streaming media, etc.
The audio output unit 303 may convert audio data received by the radio frequency unit 301 or the network module 302 or stored in the memory 309 into an audio signal and output as sound. Also, the audio output unit 303 may also provide audio output (e.g., a call signal reception sound, a message reception sound, etc.) related to a specific function performed by the electronic device 300. The audio output unit 303 includes a speaker, a buzzer, a receiver, and the like.
The input unit 304 is used to receive an audio or video signal. The input unit 304 may include a graphics processor (Graphics Processing Unit, GPU) 3041 and a microphone 3042, the graphics processor 3041 processing image data of still pictures or video obtained by an image capturing device (such as a camera) in a video capturing mode or an image capturing mode. The processed image frames may be displayed on the display unit 306. The image frames processed by the graphics processor 3041 may be stored in the memory 309 (or other storage medium) or transmitted via the radio frequency unit 301 or the network module 302. The microphone 3042 may receive sound, and may be capable of processing such sound into audio data. The processed audio data may be converted into a format output that can be transmitted to the mobile communication base station via the radio frequency unit 301 in the case of a telephone call mode.
The electronic device 300 further comprises at least one sensor 305, such as a light sensor, a motion sensor, and other sensors. Specifically, the light sensor includes an ambient light sensor, which can adjust the brightness of the display panel 3061 according to the brightness of ambient light, and a proximity sensor, which can turn off the display panel 3061 and/or the backlight when the electronic device 300 is moved to the ear. As one kind of motion sensor, the accelerometer sensor can detect the magnitude of acceleration in all directions (generally three axes), can detect the magnitude and direction of gravity when stationary, and can be used to recognize the attitude of the electronic device (such as landscape/portrait switching, related games, and magnetometer attitude calibration) and for vibration-recognition functions (such as a pedometer and tap detection). The sensor 305 may further include a fingerprint sensor, a pressure sensor, an iris sensor, a molecular sensor, a gyroscope, a barometer, a hygrometer, a thermometer, an infrared sensor, etc., which are not described herein.
The display unit 306 is used to display information input by a user or information provided to the user. The display unit 306 may include a display panel 3061, and the display panel 3061 may be configured in the form of a liquid crystal display (Liquid Crystal Display, LCD), an Organic Light-Emitting Diode (OLED), or the like.
The user input unit 307 may be used to receive input numeric or character information and to generate key signal inputs related to user settings and function control of the electronic device. Specifically, the user input unit 307 includes a touch panel 3071 and other input devices 3072. The touch panel 3071, also referred to as a touch screen, may collect touch operations performed by a user on or near it (e.g., operations performed by the user on or near the touch panel 3071 using any suitable object or accessory such as a finger or a stylus). The touch panel 3071 may include two parts, a touch detection device and a touch controller. The touch detection device detects the position of the user's touch, detects the signal produced by the touch operation, and transmits the signal to the touch controller; the touch controller receives touch information from the touch detection device, converts it into touch point coordinates, sends the coordinates to the processor 310, and receives and executes commands sent by the processor 310. In addition, the touch panel 3071 may be implemented in various types such as resistive, capacitive, infrared, and surface acoustic wave. The user input unit 307 may include other input devices 3072 in addition to the touch panel 3071. Specifically, other input devices 3072 may include, but are not limited to, a physical keyboard, function keys (e.g., volume control keys, switch keys, etc.), a trackball, a mouse, and a joystick, which are not described in detail herein.
Further, the touch panel 3071 may be overlaid on the display panel 3061, and when the touch panel 3071 detects a touch operation thereon or thereabout, the touch operation is transmitted to the processor 310 to determine a type of touch event, and then the processor 310 provides a corresponding visual output on the display panel 3061 according to the type of touch event. Although in fig. 12, the touch panel 3071 and the display panel 3061 are two independent components for implementing the input and output functions of the electronic device, in some embodiments, the touch panel 3071 and the display panel 3061 may be integrated to implement the input and output functions of the electronic device, which is not limited herein.
The interface unit 308 is an interface to which an external device is connected to the electronic apparatus 300. For example, the external devices may include a wired or wireless headset port, an external power (or battery charger) port, a wired or wireless data port, a memory card port, a port for connecting a device having an identification module, an audio input/output (I/O) port, a video I/O port, an earphone port, and the like. The interface unit 308 may be used to receive input (e.g., data information, power, etc.) from an external device and transmit the received input to one or more elements within the electronic apparatus 300 or may be used to transmit data between the electronic apparatus 300 and an external device.
The memory 309 may be used to store software programs as well as various data. The memory 309 may mainly include a program storage area and a data storage area. The program storage area may store an operating system and the application programs required for at least one function (such as a sound playing function, an image playing function, etc.); the data storage area may store data created through use of the electronic device (such as audio data, a phonebook, etc.). In addition, the memory 309 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device.
The processor 310 is a control center of the electronic device, connects various parts of the entire electronic device using various interfaces and lines, and performs various functions of the electronic device and processes data by running or executing software programs and/or modules stored in the memory 309, and calling data stored in the memory 309, thereby performing overall monitoring of the electronic device. Processor 310 may include one or more processing units; preferably, the processor 310 may integrate an application processor that primarily handles operating systems, user interfaces, applications, etc., with a modem processor that primarily handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 310.
The electronic device 300 may also include a power supply 311 (e.g., a battery) for powering the various components, and preferably the power supply 311 may be logically coupled to the processor 310 via a power management system that performs functions such as managing charge, discharge, and power consumption.
In addition, the electronic device 300 includes some functional modules, which are not shown, and will not be described herein.
An embodiment of the invention also provides an electronic device comprising a processor, a memory, and a computer program stored in the memory and executable on the processor. When executed by the processor, the computer program implements each process of the above embodiments of the method for displaying literary work chapters and achieves the same technical effect, which is not repeated here.
An embodiment of the invention also provides a computer-readable storage medium on which a computer program is stored. When executed by a processor, the computer program implements each process of the above embodiments of the method for displaying literary work chapters and achieves the same technical effect, which is not repeated here. The computer-readable storage medium may be, for example, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
From the above description of the embodiments, it will be clear to those skilled in the art that the methods of the above embodiments may be implemented by software plus a necessary general-purpose hardware platform, or by hardware alone, although in many cases the former is the preferred implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes over the prior art, may be embodied as a software product stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) and comprising instructions for causing a terminal (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to perform the methods of the embodiments of the present invention.
The embodiments of the present invention have been described above with reference to the accompanying drawings, but the present invention is not limited to these specific embodiments, which are illustrative rather than restrictive. Guided by the present invention, those of ordinary skill in the art may devise many other forms without departing from the spirit of the present invention and the scope of the claims, all of which fall within the protection of the present invention.

Claims (9)

1. A method of displaying a section of a literary composition, comprising:
parsing a hypertext markup language, HTML, document of a first literature into a first document object model, DOM, tree, the first document object model, DOM, tree comprising N nodes of N chapters of the first literature;
identifying M target nodes in the N nodes, wherein the target nodes indicate garbage chapters in the N chapters;
deleting the M target nodes from the DOM tree of the first document object model to obtain a second DOM tree;
displaying T chapters of the first literature based on the second DOM tree; wherein N, M and T are all positive integers, and N = M + T;
the identifying M target nodes of the N nodes includes: the following steps are performed for each of the N nodes:
acquiring at least one keyword in a chapter title of the node; calculating the occurrence frequency of each keyword in a pre-established garbage vocabulary library and the similarity of each keyword and words in the garbage vocabulary library, wherein the garbage vocabulary library contains repeated vocabularies; and identifying whether the node is the target node according to the frequency and the similarity.
2. The method of claim 1, wherein said identifying whether said node is said target node based on said frequency and said similarity comprises:
calculating a target indicated value according to the frequency and the similarity of the node, wherein the target indicated value is used for measuring whether a chapter indicated by the node is a garbage chapter or not;
and determining that the node is the target node in the case that the target indicated value is greater than a predetermined threshold.
3. The method of claim 1, wherein prior to said identifying M target nodes of said N nodes, the method further comprises:
crawling a second literature;
and under the condition that the chapter title of the second literature does not comprise chapter feature vocabularies, updating the garbage vocabulary library according to the chapter title of the second literature.
4. The method of claim 1, wherein prior to said identifying M target nodes of said N nodes, the method further comprises:
receiving junk chapter information sent by at least one electronic device;
and updating the garbage vocabulary library according to the garbage chapter information.
5. The method of claim 1, wherein prior to identifying the garbage section corresponding to a node on the DOM tree, the method further comprises:
judging whether to start a garbage chapter filtering mode;
wherein said identifying M target nodes of said N nodes comprises:
the M target nodes of the N nodes are identified if the garbage chapter filtering mode has been turned on.
6. The method of claim 5, wherein after the determining whether to turn on the garbage chapter filtering mode, the method further comprises:
and under the condition that the garbage chapter filtering mode is closed, displaying N chapters of the first literary work according to the DOM tree of the first document object model.
7. The method of claim 1, wherein prior to parsing the hypertext markup language HTML document of the first literary work into the first document object model DOM tree, the method further comprises:
receiving a first input of a user to the first literary work;
in response to the first input, sending a request message to a server to obtain an HTML document for the first literary work;
and receiving and storing the HTML document of the first literary work sent by the server according to the request message.
8. An electronic device, comprising:
a document parsing module, configured to parse the hypertext markup language (HTML) document of the first literature into a first document object model (DOM) tree, wherein the first DOM tree comprises N nodes of N chapters of the first literature;
a target node identification module, configured to identify M target nodes in the N nodes, wherein the target nodes indicate garbage chapters in the N chapters;
a node deleting module, configured to delete the M target nodes from the first DOM tree to obtain a second DOM tree;
a chapter display module, configured to display T chapters of the first literature based on the second DOM tree; wherein N, M and T are all positive integers, and N = M + T;
the target node identification module is specifically configured to:
the following steps are performed for each of the N nodes:
acquiring at least one keyword in a chapter title of the node; calculating the occurrence frequency of each keyword in a pre-established garbage vocabulary library and the similarity of each keyword and words in the garbage vocabulary library, wherein the garbage vocabulary library contains repeated vocabularies; and identifying whether the node is the target node according to the frequency and the similarity.
9. An electronic device comprising a processor, a memory and a computer program stored on the memory and executable on the processor, the computer program when executed by the processor implementing the steps of the method of displaying literary composition chapters of any one of claims 1 to 7.
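The identification step recited in claims 1 and 2 (keyword frequency in the garbage vocabulary library, keyword similarity to library words, and a threshold on a combined value) can be sketched as below. The claims do not fix the formulas, so everything concrete here is an assumption: the sample library, whitespace tokenization, relative frequency, `difflib` string similarity, the equal-weight sum, taking the best-scoring keyword, and the threshold of 0.6.

```python
from collections import Counter
from difflib import SequenceMatcher
from typing import List

# Hypothetical garbage vocabulary library. Repeated words are kept on purpose,
# so that counts reflect how often a word has appeared in garbage titles.
GARBAGE_WORDS = ["leave", "notice", "leave", "announcement", "notice", "leave"]
LIBRARY = Counter(GARBAGE_WORDS)


def keywords(title: str) -> List[str]:
    """Split a chapter title into lowercase keywords (simplistic tokenizer)."""
    return [w.strip(",.!?:").lower() for w in title.split() if w.strip(",.!?:")]


def frequency(word: str) -> float:
    """Share of library entries that equal the keyword."""
    return LIBRARY[word] / sum(LIBRARY.values())


def similarity(word: str) -> float:
    """Best string similarity between the keyword and any library word."""
    return max(SequenceMatcher(None, word, entry).ratio() for entry in LIBRARY)


def target_value(title: str, weight: float = 0.5) -> float:
    """Combine frequency and similarity; the best-scoring keyword decides."""
    words = keywords(title)
    if not words:
        return 0.0
    return max(weight * frequency(w) + (1 - weight) * similarity(w) for w in words)


def is_target_node(title: str, threshold: float = 0.6) -> bool:
    """A node is a target node when its value exceeds the threshold."""
    return target_value(title) > threshold
```

Under these assumed weights, a keyword absent from the library can score at most 0.5, so a title is flagged only when one of its keywords occurs in the library often enough. For example, "Leave notice" is flagged while "Chapter 1 Departure" is not.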
CN201911030123.9A 2019-10-28 2019-10-28 Display method, electronic equipment and medium for literary composition chapters Active CN110929474B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911030123.9A CN110929474B (en) 2019-10-28 2019-10-28 Display method, electronic equipment and medium for literary composition chapters

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911030123.9A CN110929474B (en) 2019-10-28 2019-10-28 Display method, electronic equipment and medium for literary composition chapters

Publications (2)

Publication Number Publication Date
CN110929474A CN110929474A (en) 2020-03-27
CN110929474B true CN110929474B (en) 2023-10-20

Family

ID=69849620

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911030123.9A Active CN110929474B (en) 2019-10-28 2019-10-28 Display method, electronic equipment and medium for literary composition chapters

Country Status (1)

Country Link
CN (1) CN110929474B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113282811A (en) * 2021-05-27 2021-08-20 广州文石信息科技有限公司 MOBI document display method, device and equipment

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102315953A (en) * 2010-06-29 2012-01-11 百度在线网络技术(北京)有限公司 Method and device for detecting junk posts based on occurrence rule of posts
CN102779170A (en) * 2012-06-25 2012-11-14 北京奇虎科技有限公司 System and method for identifying text floor of webpage
CN104216872A (en) * 2013-05-31 2014-12-17 腾讯科技(深圳)有限公司 Method and device for identifying rubbish chapters in network novels
CN106445967A (en) * 2015-08-11 2017-02-22 腾讯科技(深圳)有限公司 Resource directory management method and apparatus
CN107025247A (en) * 2016-02-02 2017-08-08 广州市动景计算机科技有限公司 Method, equipment, browser and the electronic equipment handled web data
CN110377884A (en) * 2019-06-13 2019-10-25 北京百度网讯科技有限公司 Document analytic method, device, computer equipment and storage medium

Also Published As

Publication number Publication date
CN110929474A (en) 2020-03-27

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant