CN109948095A - Method, apparatus, terminal, and storage medium for displaying web page contents - Google Patents


Info

Publication number
CN109948095A
Authority
CN
China
Prior art keywords
webpage
web page
content
node
label
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201711202503.7A
Other languages
Chinese (zh)
Other versions
CN109948095B (en
Inventor
张枫枫
孟德全
胡晶晶
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN201711202503.7A priority Critical patent/CN109948095B/en
Publication of CN109948095A publication Critical patent/CN109948095A/en
Publication of CN109948095B publication Critical patent/CN109948095B/en

Abstract

The invention discloses a method, an apparatus, and a storage medium for displaying web page contents, and belongs to the field of network technologies. The method includes: obtaining the web page elements of a first web page to be displayed; determining, according to the tags of the web page elements, the non-text content and the text content of the first web page from the web page contents of the first web page; and displaying a second web page that conforms to a preset format, the second web page including the non-text content and the text content of the first web page. Because both the text content and the non-text content can be determined from the web page contents of the first web page, the second web page can show not only the text content of the first web page but also its non-text content. This avoids the large difference between the contents before and after typesetting that results from filtering out non-text content, and improves accuracy.

Description

Method, apparatus, terminal, and storage medium for displaying web page contents
Technical field
The present invention relates to the field of network technologies, and in particular to a method, an apparatus, a terminal, and a storage medium for displaying web page contents.
Background
When reading news and information web pages, users often want to consult the same contents again later, so many application software programs (APPs) have added a collection function. The collection function of an application can collect not only web page contents within that application but also web page contents from other applications. Because the web pages of different applications use different typesetting styles, the terminal re-typesets the web page contents to achieve a unified style when the user views a collected web page in the application, and displays the web page contents after typesetting.
Currently, the web page contents are typeset by the server of the application. The process may be as follows: when the user reads the web page contents of a collected web page, the terminal sends the web page address of the web page to the server; the server pulls the web page contents according to the web page address, extracts the text content of the web page contents, typesets the extracted text content according to a preset format, and sends the typeset text content to the terminal; the terminal receives and displays the typeset text content.
In the course of implementing the present invention, the inventors found that the prior art has at least the following problem:
In the above method, the server can only extract the text content of the web page contents and typeset that text content. Because only the text content is extracted and the non-text content is filtered out, the web page contents after typesetting differ considerably from the web page contents before typesetting; that is, accuracy is poor.
Summary of the invention
The present invention provides a method, an apparatus, a terminal, and a storage medium for displaying web page contents, which can solve the problem of poor display accuracy. The technical solutions are as follows:
In one aspect, the present invention provides a method for displaying web page contents, the method comprising:
obtaining the web page elements of a first web page to be displayed;
determining, according to the tags of the web page elements, the non-text content and the text content of the first web page from the web page contents of the first web page; and
displaying a second web page that conforms to a preset format, the second web page including the non-text content and the text content of the first web page.
In one aspect, the present invention provides a method for displaying collected web page contents, the method comprising:
displaying the collection entry of at least one collected web page, the collection entry of any web page including the web page address of that web page;
obtaining the web page elements of a selected first web page according to the web page address of the first web page;
determining, according to the tags of the web page elements, the non-text content and the text content of the first web page from the web page contents of the first web page; and
displaying a second web page that conforms to a preset format, the second web page including the non-text content and the text content of the first web page.
In one aspect, the present invention provides an apparatus for displaying web page contents, the apparatus comprising:
an obtaining module, configured to obtain the web page elements of a first web page to be displayed;
a determining module, configured to determine, according to the tags of the web page elements, the non-text content and the text content of the first web page from the web page contents of the first web page; and
a display module, configured to display a second web page that conforms to a preset format, the second web page including the non-text content and the text content of the first web page.
In one aspect, the present invention provides an apparatus for displaying collected web page contents, the apparatus comprising:
a display module, configured to display the collection entry of at least one collected web page, the collection entry of any web page including the web page address of that web page;
an obtaining module, configured to obtain the web page elements of a selected first web page according to the web page address of the first web page; and
a determining module, configured to determine, according to the tags of the web page elements, the non-text content and the text content of the first web page from the web page contents of the first web page;
the display module being further configured to display a second web page that conforms to a preset format, the second web page including the non-text content and the text content of the first web page.
In one aspect, the present invention provides a terminal including a processor and a memory. The memory stores at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by the processor to perform the operations of the method for displaying web page contents described above.
In one aspect, the present invention provides a terminal including a processor and a memory. The memory stores at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by the processor to perform the operations of the method for displaying collected web page contents described above.
In one aspect, the present invention provides a computer-readable storage medium storing at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by a processor to perform the operations of the method for displaying web page contents described above.
In one aspect, the present invention provides a computer-readable storage medium storing at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by a processor to perform the operations of the method for displaying collected web page contents described above.
In the method for displaying web page contents provided by the embodiments of the present invention, the web page elements of the first web page to be displayed are obtained, and the non-text content and the text content of the first web page are determined from the web page contents of the first web page according to the tags of the web page elements. Because both the text content and the non-text content can be determined from the web page contents of the first web page, the second web page can show not only the text content of the first web page but also its non-text content. This avoids the large difference between the contents before and after typesetting that results from filtering out non-text content, and improves accuracy.
Brief description of the drawings
FIG. 1A is a schematic diagram of an implementation environment according to an embodiment of the present invention;
FIG. 1B is a schematic diagram of displaying sharing interfaces according to an embodiment of the present invention;
FIG. 1C is a schematic diagram of collecting the web page address of a web page according to an embodiment of the present invention;
FIG. 1D is a schematic diagram of displaying prompt information according to an embodiment of the present invention;
FIG. 2A is a flowchart of a method for displaying web page contents according to an embodiment of the present invention;
FIG. 2B is a schematic diagram of displaying a collection interface according to an embodiment of the present invention;
FIG. 2C is a schematic diagram of displaying the web page contents of a first web page according to an embodiment of the present invention;
FIG. 3 is a flowchart of a method for displaying collected web page contents according to an embodiment of the present invention;
FIG. 4A is a schematic structural diagram of an apparatus for displaying web page contents according to an embodiment of the present invention;
FIG. 4B is a schematic structural diagram of a determining module according to an embodiment of the present invention;
FIG. 4C is a schematic structural diagram of an apparatus for displaying web page contents according to an embodiment of the present invention;
FIG. 4D is a schematic structural diagram of a display module according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of an apparatus for displaying collected web page contents according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a terminal according to an embodiment of the present invention.
Detailed description
To make the objectives, technical solutions, and advantages of the present invention clearer, the embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
An embodiment of the present invention provides an implementation environment. Referring to FIG. 1A, the implementation environment includes a terminal 101 and a resource server 102. The terminal 101 may be a mobile phone terminal, a PAD (Portable Android Device, tablet computer) terminal, a computer terminal, or any other device on which an application with a collection function is installed. The terminal 101 can access the resource server 102 over a network to obtain the service provided by the resource server 102. The service may be a web content service, for example, a news service or a public account service.
The terminal 101 can access the resource server 102 through any application installed on the terminal 101. While accessing the web page contents provided by the resource server 102, the terminal 101 can collect the accessed contents through any application that has a collection function, so that the contents can be retrieved quickly from the collection when they need to be consulted again. Collecting means storing the address and/or the content data of the web page contents to be collected on the server corresponding to the application client. The application with the collection function and the application used to access the web page contents may be the same application. For example, when web page contents are displayed in an application client, triggering the collection function of that application client stores the web page address and/or content data of the web page contents on the server corresponding to the application client, which completes the collection of the web page contents. The two may also be different applications. For example, when the terminal 101 displays contents such as news, web pages, or public account articles in a first application client, triggering a shortcut function of the first application client, such as a collection function, can call or trigger an interface of another application client that has a collection function (a second application client). As a result, an information or data transmission connection is established between the first application client and the second application client; or a preset trigger signal is sent to the second application client; or a preset signal is generated by triggering the shortcut function, to be captured or received by the second application client. Correspondingly, the second application client may establish the communication connection with the first application client through the interface, or may capture, look up, or receive the preset signal that the first application client triggered or generated through its shortcut function, and then collect the news, web page, or public account contents displayed in the first application client. Alternatively, the first application client may pass the web page address and/or content data of the web page contents to the second application client, which stores them on the server corresponding to the second application client to complete the collection of the web page contents. The embodiments of the present invention do not limit how the collection function is implemented or the specific form of the applications. When the terminal completes the collection of the web page contents, it displays a prompt message indicating that the collection succeeded.
When the second application client of the terminal 101 collects web page contents displayed in the first application client, the terminal 101 runs the first application client in the foreground, the first application client displays the web page contents of a web page, and a share button is displayed on that web page. When the user wants to collect the web page contents, the user can tap the share button. When the terminal 101 detects that the share button is triggered, it displays at least one sharing interface, which includes a collection interface that calls the second application client; the collection interface may be "copy link", "share link", or the like. The user can tap the collection interface to collect the web page. When the terminal 101 detects that the collection interface is triggered, it runs the second application client in the foreground and stores the web page address and/or content data of the web page on the server corresponding to the second application client.
For example, FIG. 1B shows a user reading, on the terminal 101 (a mobile phone terminal in this embodiment), the web page contents of an article titled "Understand in three minutes: passing by from the whole world of the product manager". If the user is interested in the contents and wants to collect them, the user can trigger a shortcut function to send a share instruction to the terminal 101. When the terminal 101 detects the share instruction triggered by the user, it displays at least one sharing interface, which includes a collection interface used to collect the web page. For example, the sharing interfaces include: generate a picture to share, share with a friend through a social application, copy link, and share to an information display platform. Here, "copy link" is the collection interface. When the user wants to collect the web page contents, the user taps "copy link". When the terminal 101 detects that the collection interface is triggered, it obtains the web page address and/or content data of the web page and stores them on the server corresponding to the application client, which completes the collection of the web page contents; see FIG. 1C. When the terminal 101 completes the collection of the web page contents, it displays a prompt message, which may be "Collected successfully"; see FIG. 1D.
When the terminal 101 stores the web page address and/or content data of the web page contents on the server corresponding to the application client, the terminal 101 generates a collection entry for the web page. The collection entry includes at least the web page address of the web page, and may also include information such as the summary information of the web page, the web page title, and/or the application identifier of the source application of the web page. The application identifier may be the application name or the like.
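The collection entry described above can be pictured as a small record in which only the web page address is mandatory. The following is a minimal sketch, not the patent's data layout; the class name `CollectionEntry`, its field names, and the example URL and application name are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CollectionEntry:
    """One collected web page; only the web page address is required."""
    url: str                          # web page address (always present)
    title: Optional[str] = None       # web page title, if stored
    summary: Optional[str] = None     # summary information, if stored
    source_app: Optional[str] = None  # application identifier, e.g. the app name


# Hypothetical example entry with only the address and the source application.
entry = CollectionEntry(url="https://example.com/article", source_app="ExampleNews")
```

Keeping the optional fields as `None` by default mirrors the "at least the web page address" wording: an entry stays valid even when no title or summary was captured.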
It should be noted that the web page contents collected through any application may be processed and displayed by the application client, or may be processed by the corresponding server through interaction with the application client and then returned to the application client for display. In addition, information such as the collected web page contents may be saved by the server on a per-user basis, so that after logging in to the application client on any terminal 101, the same user can use the collection function to browse the information collected on any terminal 101.
An embodiment of the present invention provides a method for displaying web page contents, which may be executed by a terminal. Referring to FIG. 2A, the method includes:
201. The terminal obtains the web page elements of the first web page to be displayed.
The first web page is a web page that the terminal has collected. The embodiments of the present invention are described only by taking the viewing of collected web page contents as an example, and do not limit the source application of the collected web page. The terminal may store in advance the web page address or the web page contents of the collected web page. If the web page address is stored, the terminal can obtain the web page elements of the first web page according to the web page address, which avoids occupying too much of the terminal's storage space. If the web page contents are stored, the terminal can obtain the web page elements of the first web page from the pre-stored web page contents.
In the embodiment in which the web page address is stored, the terminal obtains the web page elements of the first web page to be displayed through the following steps (1) to (3). The web page elements of the first web page may be HTML (HyperText Markup Language) elements of the web page.
(1) In response to a view instruction, the terminal displays the web page address of at least one collected web page according to the view instruction.
The main interface of the web page contents currently displayed by the terminal includes a view button. When the user wants to view the web page contents of a web page, the user can trigger the view button to send a view instruction to the terminal. When the terminal detects that the view button is triggered, it responds to the view instruction and, according to the view instruction, displays the web page address of at least one collected web page.
When displaying the web page address of at least one collected web page, the terminal may display the collection entry of the at least one web page. The collection entry of each web page includes at least the web page address of the web page, and may also include information such as the summary information of the web page, the web page title, and/or the application identifier of the source application of the web page. Based on the displayed collection entries, the user can select the web page address of the web page to be displayed from the web page addresses of the at least one web page and submit the selected web page address to the terminal, after which step (2) is executed.
For example, referring to FIG. 2B, the terminal displays the web page addresses of two collected web pages, namely "Understand in three minutes: passing by from the whole world of the product manager" and "HTML_Baidu Baike".
(2) The terminal obtains the web page address of the selected first web page from the web page addresses of the multiple web pages.
For example, the user selects "Understand in three minutes: passing by from the whole world of the product manager" from the web page addresses of the two web pages that the terminal has collected, and the terminal obtains the web page address of that web page.
(3) The terminal obtains the web page elements of the first web page from the web page address of the first web page.
The terminal loads the web page address of the first web page through a WebView (web view) to access the source server of the first web page, and obtains the web page elements of the first web page from that web page address. The web page elements of the first web page may be the source code of the first web page.
In the embodiments of the present invention, after obtaining the web page elements of the first web page, the terminal may directly perform the following steps 202 to 205: determine the non-text content and the text content of the first web page from the web page contents of the first web page according to the tags of the web page elements, and then typeset the non-text content and the text content. Alternatively, the terminal may first delete the non-content elements from the web page elements of the first web page, and only then determine the non-text content and the text content of the first web page from the web page contents of the first web page.
The step in which the terminal deletes the non-content elements from the web page elements of the first web page may be as follows: the terminal determines the non-content web page elements among the web page elements and deletes them from the web page elements to obtain the content web page elements. Step 202 is then executed based on the content web page elements.
The non-content web page elements may be the web page elements corresponding to third-class tags, where the third-class tags include style sheet tags, style tags, and/or script tags. Correspondingly, the step in which the terminal determines the non-content web page elements among the web page elements may be as follows: according to the tags of the web page elements, the terminal determines the web page elements corresponding to the third-class tags among the web page elements, and takes those web page elements as the non-content web page elements.
For example, the third-class tags include a style sheet tag, a style tag, and a script tag. The style sheet tag may be a CSS (Cascading Style Sheets) tag, the style tag may be the style tag, and the script tag may be the script tag. According to the tags of the web page elements, the terminal determines, among the web page elements, the web page elements corresponding to the CSS tag, the style tag, and the script tag, and these web page elements together form the non-content web page elements.
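The deletion of non-content web page elements can be illustrated with a short sketch built on Python's standard `html.parser` module. It is an illustration under stated assumptions rather than the patented implementation: the function name `strip_non_content` is hypothetical, and treating `<link rel="stylesheet">` as the "CSS tag" is an assumption, since the text does not name a concrete element for it.

```python
from html import escape
from html.parser import HTMLParser

SKIPPED = {"style", "script"}  # elements dropped together with their contents


def _is_stylesheet_link(tag, attrs):
    # Assumption: the "CSS tag" is modelled as <link rel="stylesheet">.
    return tag == "link" and (dict(attrs).get("rel") or "").lower() == "stylesheet"


class _ContentOnly(HTMLParser):
    def __init__(self):
        super().__init__()  # character references arrive already decoded
        self.out = []
        self.skip_depth = 0  # > 0 while inside <style> or <script>

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED:
            self.skip_depth += 1
        elif not self.skip_depth and not _is_stylesheet_link(tag, attrs):
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing form such as <br/>: emit once, never push a close tag.
        if not self.skip_depth and tag not in SKIPPED and not _is_stylesheet_link(tag, attrs):
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in SKIPPED:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif not self.skip_depth:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))  # re-escape decoded text


def strip_non_content(html: str) -> str:
    """Return the markup with style, script, and stylesheet-link elements removed."""
    parser = _ContentOnly()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

Comments and doctype declarations are also dropped by this sketch because the corresponding handlers are not overridden; a fuller version would decide explicitly whether to keep them.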
In the embodiments of the present invention, the terminal deletes the non-content web page elements from the web page elements of the first web page, so that the subsequent determination of the non-text content and the text content in the web page contents of the first web page is not affected by these non-content web page elements (such as style sheets, styles, and/or scripts), which improves the accuracy of typesetting.
202. The terminal constructs the topological structure of the first web page according to the web page elements of the first web page, where each element node in the topological structure corresponds to one web page element.
The terminal determines the hierarchical relationship between the web page elements of the first web page and constructs the topological structure of the first web page according to that hierarchical relationship. The topological structure may be a DOM (Document Object Model) tree. A DOM tree is the tree structure of web page elements generated by parsing the web page elements of the first web page through the DOM. One element node of the tree structure corresponds to one web page element, and the node tag of the element node stores the web page contents of that web page element.
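As an illustration of building such a tree from the hierarchical relationship of the elements, the sketch below uses Python's standard `html.parser` and a stack of open elements. It is a simplified stand-in for a real DOM implementation; the names `Node`, `TreeBuilder`, and `build_tree`, and the short list of void elements, are assumptions made for the example.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser

# Elements that never have children or a closing tag (partial list, assumed).
VOID = {"br", "img", "hr", "input", "meta", "link"}


@dataclass
class Node:
    tag: str
    attrs: dict = field(default_factory=dict)
    children: list = field(default_factory=list)
    text: str = ""  # text found directly inside this element


class TreeBuilder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.stack = [self.root]  # currently open elements, innermost last

    def handle_starttag(self, tag, attrs):
        node = Node(tag, dict(attrs))
        self.stack[-1].children.append(node)  # child of the innermost open element
        if tag not in VOID:
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Close back to the nearest open element with this tag; ignore strays.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        self.stack[-1].text += data


def build_tree(html: str) -> Node:
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root
```

Each `Node` plays the role of one element node: the nesting of `children` records the hierarchy, and `text` holds the content stored at that node.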
In a possible implementation, the terminal may instead not delete the non-content elements from the web page elements of the first web page before constructing the topological structure. In that case, when constructing the topological structure of the first web page according to its web page elements, the terminal ignores the non-content elements of the first web page and constructs the topological structure only from the content elements of the first web page.
In the embodiments of the present invention, the terminal constructs the topological structure of the first web page and performs the subsequent processing based on that topological structure, which improves the efficiency of the subsequent typesetting.
203. The terminal determines first element nodes in the topological structure, where a first element node is an element node of non-text content.
The web page element corresponding to an element node includes a tag, and the tags of text content differ from the tags of non-text content. The terminal can therefore determine the first element nodes in the topological structure by the tags of the web page elements. Correspondingly, this step may be as follows: the terminal determines the element nodes corresponding to first-class tags in the topological structure, and/or determines the element nodes corresponding to second-class tags in the topological structure, where the first-class tags include quotation tags, table tags, and/or code block tags, and the second-class tags include custom tags; the terminal then takes the element nodes corresponding to the first-class tags and/or the element nodes corresponding to the second-class tags as the first element nodes.
The terminal traverses the web page element corresponding to each element node in the topological structure, determines the web page elements whose tag is a first-class tag, and determines the element nodes corresponding to those web page elements as the element nodes corresponding to the first-class tags. Likewise, the terminal traverses the web page element corresponding to each element node in the topological structure, determines the web page elements whose tag is a second-class tag, and determines the element nodes corresponding to those web page elements as the element nodes corresponding to the second-class tags.
For example, the first-class tags include a quotation tag, a table tag, and code block tags. The quotation tag is <blockquote>, the table tag is <table>, and the code block tags are <code>, <pre>, and the like. The terminal determines the element nodes whose tag is <blockquote>, <table>, <code>, or <pre> in the topological structure and takes the determined element nodes as first element nodes. The second-class tags include custom tags, for example, audio tags, video tags, and picture tags.
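The traversal described above can be sketched as a walk over the tree that collects every node whose tag belongs to the first-class or second-class set. The first-class tag names come from the text; the concrete names in `CUSTOM_TAGS`, the `Node` type, and the choice not to descend into a matched node (treating its whole subtree as one non-text block) are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Set

FIRST_CLASS_TAGS = {"blockquote", "table", "code", "pre"}  # named in the text
CUSTOM_TAGS = {"audio", "video", "img"}  # hypothetical second-class tag names


@dataclass
class Node:
    tag: str
    children: List["Node"] = field(default_factory=list)


def find_first_element_nodes(root: Node, tags: Set[str]) -> List[Node]:
    """Return non-text element nodes in document order."""
    found = []
    stack = [root]
    while stack:
        node = stack.pop()
        if node.tag in tags:
            found.append(node)
            continue  # assumption: a matched subtree is kept as one block
        stack.extend(reversed(node.children))  # keep left-to-right order
    return found


# Small hand-built tree standing in for the topological structure.
page = Node("div", [
    Node("p"),
    Node("table", [Node("tr", [Node("code")])]),
    Node("video"),
])
non_text = find_first_element_nodes(page, FIRST_CLASS_TAGS | CUSTOM_TAGS)
```

In this example the nested `code` node is not reported separately, because the enclosing `table` has already been captured as a single non-text block.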
Because different applications are coded differently, different applications may correspond to different first-class tags and second-class tags. Before this step, the terminal needs to determine the first-class tags and/or the second-class tags. The process may be as follows: the terminal obtains the application identifier of the source application of the first web page and, according to the application identifier, determines the first-class tags and/or second-class tags corresponding to that application identifier.
When collecting the first web page, the terminal stores the collection entry of the first web page, and the collection entry includes the application identifier of the source application of the first web page. The terminal can therefore obtain the application identifier of the source application of the first web page directly from the collection entry of the first web page.
Different applications correspond to different first-class tags. Before this step, the terminal obtains the application identifiers of multiple source applications and the first-class tags of each application, and stores the correspondence between the application identifier of each source application and its first-class tags. Correspondingly, the step in which the terminal determines the first-class tags corresponding to the application identifier may be as follows: according to the application identifier, the terminal obtains the first-class tags corresponding to that application identifier from the correspondence between application identifiers and first-class tags.
Likewise, different applications correspond to different second-class tags. Before this step, the terminal obtains the application identifiers of multiple source applications and the second-class tags of each source application, and stores the correspondence between the application identifier of each source application and its second-class tags. Correspondingly, the step in which the terminal determines the second-class tags corresponding to the application identifier may be as follows: according to the application identifier, the terminal obtains the second-class tags corresponding to that application identifier from the correspondence between application identifiers and second-class tags.
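The stored correspondence between application identifiers and tag sets amounts to a lookup table keyed by application identifier. The sketch below shows one possible shape; the application name `ExampleNews`, the custom tag names, and the fallback to a default set for unknown applications are assumptions that the text does not specify.

```python
from typing import Dict, Set

# Fallback used when an application has no stored correspondence (assumed behaviour).
DEFAULT_TAGS: Dict[str, Set[str]] = {
    "first_class": {"blockquote", "table", "code", "pre"},
    "second_class": set(),
}

# Hypothetical correspondence: application identifier -> tag sets.
TAGS_BY_APP: Dict[str, Dict[str, Set[str]]] = {
    "ExampleNews": {
        "first_class": {"blockquote", "table", "pre"},
        "second_class": {"x-video", "x-gallery"},  # made-up custom tags
    },
}


def non_text_tags_for(app_id: str) -> Set[str]:
    """Return the union of first-class and second-class tags for one source app."""
    entry = TAGS_BY_APP.get(app_id, DEFAULT_TAGS)
    return entry["first_class"] | entry["second_class"]
```

The application identifier passed in would come from the collection entry of the first web page, as described above.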
In the embodiments of the present invention, after the terminal determines the first element node in the topological structure, it stores the correspondence between the web page address of the first webpage and the first element node. When the terminal subsequently displays the web page contents of the first webpage again, it obtains the first element node directly from this correspondence according to the web page address of the first webpage, without repeating the identification process described above. This improves identification efficiency and, in turn, the efficiency of subsequent typesetting.
204, the terminal determines a second element node in the topological structure, the second element node being an element node of text content.
The web page element corresponding to an element node includes a label, and the label of text content differs from the label of non-textual content. The terminal can therefore determine the second element node in the topological structure by the labels of the web page elements. Accordingly, this step may be: the terminal determines, in the topological structure, the element node corresponding to a fourth-class label and takes that element node as the second element node. The fourth-class label is a text label.
For example, the fourth-class label includes a text label such as <class>. The terminal determines, in the topological structure, the element nodes whose label is <class> and takes the element nodes so determined as second element nodes.
Because different applications are coded differently, different applications may correspond to different fourth-class labels. The terminal may obtain the fourth-class label corresponding to the application identifier of the source application of the first webpage in the same way as it obtains the first-class label and/or the second-class label above.
The topological structure includes the first element node and the second element node. Therefore, once the terminal has determined the first element node, it may take the other element nodes in the topological structure, other than the first element node, as second element nodes.
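The complement rule in the preceding paragraph can be sketched as a single pass over the element nodes. The ElementNode type, the flat node list and the label names are assumptions made for illustration; the embodiment itself works on a topological structure whose representation is not given here.

```python
from dataclasses import dataclass


@dataclass
class ElementNode:
    """Hypothetical element node: a label plus the content it carries."""
    label: str
    content: str


def split_nodes(nodes, non_text_labels):
    """Split nodes into (first element nodes, second element nodes).

    Nodes whose label is one of the non-text labels are treated as first
    element nodes; every remaining node is treated as a second element node.
    """
    first = [n for n in nodes if n.label in non_text_labels]
    second = [n for n in nodes if n.label not in non_text_labels]
    return first, second


nodes = [
    ElementNode("p", "Some paragraph text"),
    ElementNode("table", "a | b"),
    ElementNode("pre", "print('hi')"),
]
first_nodes, second_nodes = split_nodes(nodes, {"blockquote", "table", "pre"})
```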
Likewise, after the terminal determines the second element node in the topological structure, it stores the correspondence between the web page address of the first webpage and the second element node. When the terminal subsequently displays the web page contents of the first webpage again, it obtains the second element node directly from this correspondence according to the web page address of the first webpage, without repeating the identification process described above. This improves identification efficiency and, in turn, the efficiency of subsequent typesetting.
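Storing the identification result against the web page address amounts to a cache keyed by that address. The sketch below assumes an in-memory dictionary and a caller-supplied classify function; both are illustrative choices, and the embodiment does not say where or how the correspondence is stored.

```python
# Hypothetical in-memory store: web page address -> identified element nodes.
node_cache = {}


def classify_with_cache(url, nodes, classify):
    """Return the identification result for url, computing it at most once.

    classify is any function that maps the page's nodes to a result,
    for example a (first element nodes, second element nodes) pair.
    """
    if url not in node_cache:
        node_cache[url] = classify(nodes)
    return node_cache[url]
```

On a second display of the same address the stored result is returned and classify is not called again, which is the saving the paragraph above describes.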
It should be noted that steps 203 and 204 have no strict chronological order: step 203 may be performed first and then step 204; step 204 may be performed first and then step 203; or steps 203 and 204 may be performed simultaneously by two processes. This is not limited in the embodiments of the present invention.
205, the terminal obtains the non-textual content from the node label of the first element node and obtains the text content from the node label of the second element node.
The node label of an element node includes web page contents. The terminal obtains the non-textual content from the node label of the first element node and the text content from the node label of the second element node, and then performs step 206 to typeset the non-textual content and the text content.
206, the terminal displays a second webpage that meets a preset format, the second webpage including the non-textual content and the text content of the first webpage.
The terminal may typeset the non-textual content and the text content directly, which is the first implementation below. The terminal may also identify the body matter from the non-textual content and the text content and typeset only the body matter, which is the second implementation below.
For the first implementation, this step may be: the terminal composes the non-textual content and the text content into the web page contents of the second webpage, and displays the web page contents of the second webpage in accordance with a first preset format.
The first preset format includes a first non-textual content display format and a first text content display format. The first text content display format includes a first paragraph format and/or a first font format. The first non-textual content display format includes a first quotation display format, a first table display format and/or a first code block display format.
In the embodiments of the present invention, after identifying the non-textual content and the text content, the terminal typesets them directly, which improves typesetting efficiency. Because the non-textual content is not filtered out, the web page contents of the second webpage are also more readable. In addition, when typesetting the web page contents of the second webpage, the terminal does not edit the original text and does not change its meaning, which fully respects the original author.
For example, figure A in Fig. 2C shows the web page contents of the first webpage before typesetting: the font is Song typeface, the title and the recommendation information are set in size 14, and the body matter is set in size 12. The terminal adjusts the font format and font size of the web page contents of the first webpage, changing the font to regular script and setting the font size of the recommendation information to size 8, to obtain the web page contents of the second webpage; see figure B in Fig. 2C.
For the second implementation, this step may be realized by the following steps (1) to (3):
(1) the non-textual content and content of text are formed the web page contents of the second webpage by terminal.
(2) terminal identifies body matter from the web page contents of the second webpage.
The terminal may identify the body matter in the web page contents of the second webpage by a first preset regular expression, which is used to identify body matter in web page contents; this is the first way below. The terminal may also identify the body matter by a second preset regular expression, which is used to identify non-body matter in web page contents; this is the second way below. The terminal may also identify the body matter according to the region in which the body matter is located; this is the third way below. The terminal may also identify the body matter according to the labels of the element nodes; this is the fourth way below.
The first way: step (2) may be: the terminal identifies the body matter from the web page contents of the second webpage by the first preset regular expression.
The terminal identifies a second specified element node from the element nodes of the second webpage by the first preset regular expression, determines the second node content corresponding to the second specified element node from the web page contents of the second webpage, and takes the second node content as the body matter.
The first preset regular expression includes at least one first label, where a first label is a label corresponding to body matter. Accordingly, the step in which the terminal identifies the second specified element node from the element nodes of the second webpage by the first preset regular expression may be: the terminal traverses the label of each element node of the second webpage and determines the second specified element node, whose label matches the first preset regular expression.
For example, if the first labels are label A, label B and label C, the first preset regular expression may be: label A or label B or label C.
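The "label A or label B or label C" form maps naturally onto a regular-expression alternation that is matched against each node's label. In the sketch below the three labels ("article", "section", "p") and the (label, content) tuple representation of a node are hypothetical stand-ins chosen for illustration.

```python
import re

# Hypothetical first preset regular expression: label A | label B | label C.
FIRST_PRESET_REGEX = re.compile(r"article|section|p")


def find_body_nodes(nodes):
    """Traverse (label, content) pairs and keep those whose label matches.

    fullmatch is used so that a label such as "pre" does not match merely
    because it begins with "p".
    """
    return [
        (label, content)
        for label, content in nodes
        if FIRST_PRESET_REGEX.fullmatch(label)
    ]


page_nodes = [
    ("article", "Main story"),
    ("aside", "Sponsored link"),
    ("p", "A paragraph"),
    ("pre", "code sample"),
]
body_nodes = find_body_nodes(page_nodes)
```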
The second way: step (2) may be: the terminal identifies the non-body matter from the web page contents of the second webpage by the second preset regular expression, and determines the content in the web page contents of the second webpage other than the non-body matter as the body matter. The second preset regular expression is used to identify non-body matter in web page contents. Non-body matter includes recommended links (such as advertisements) and the like.
The second preset regular expression includes at least one second label, where a second label is a label corresponding to non-body matter. The terminal identifies a third specified element node from the element nodes of the second webpage by the second preset regular expression, determines the third node content corresponding to the third specified element node from the web page contents of the second webpage, and takes the third node content as the non-body matter. Accordingly, the step in which the terminal identifies the third specified element node may be: the terminal traverses the label of each element node of the second webpage and determines the third specified element node, whose label matches the second preset regular expression.
In a possible implementation, the second preset regular expression may also include at least one keyword, each keyword being a keyword corresponding to non-body matter. For example, a keyword may be "guess you like" or "purchase". Accordingly, the step in which the terminal identifies the non-body matter from the web page contents of the second webpage by the second preset regular expression may be: the terminal divides the web page contents of the second webpage into multiple content blocks and determines the matching degree between each content block and the second preset regular expression; according to these matching degrees, it selects from the multiple content blocks those whose matching degree exceeds a preset threshold and determines them as non-body matter. The terminal may take one paragraph of the web page contents as one content block.
The preset threshold may be set and changed as needed, and is not specifically limited in the embodiments of the present invention. For example, the preset threshold may be 80% or 85%.
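The keyword variant can be sketched as follows. The text does not define how the matching degree is computed, so this sketch assumes one possible definition: the fraction of a content block's characters covered by keyword matches. The keyword list, the 80% threshold and the use of line breaks as paragraph boundaries are likewise illustrative assumptions.

```python
import re

# Hypothetical second preset regular expression built from keywords.
SECOND_PRESET_REGEX = re.compile(r"guess you like|purchase", re.IGNORECASE)
PRESET_THRESHOLD = 0.8  # example value; configurable per the text


def matching_degree(block):
    """Assumed definition: share of the block's characters that are matched."""
    text = block.strip()
    if not text:
        return 0.0
    matched = sum(len(m.group(0)) for m in SECOND_PRESET_REGEX.finditer(text))
    return matched / len(text)


def split_body(web_content):
    """Treat each non-empty line as one content block and split by threshold."""
    blocks = [b for b in web_content.split("\n") if b.strip()]
    non_body = [b for b in blocks if matching_degree(b) > PRESET_THRESHOLD]
    body = [b for b in blocks if matching_degree(b) <= PRESET_THRESHOLD]
    return body, non_body


content = (
    "The terminal typesets the page.\n"
    "Guess you like\n"
    "You can purchase it later."
)
body, non_body = split_body(content)
```

Under this assumed definition a block consisting only of a keyword is filtered out, while a longer paragraph that merely mentions a keyword stays in the body matter.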
The third way: because the body matter is usually located in the middle region of a webpage, the terminal can identify the body matter from the web page contents of the second webpage according to a preset body region. Accordingly, step (2) may be: the terminal determines a specified region in the second webpage and takes the web page contents in the specified region as the body matter.
Because different applications lay out webpages differently, the step in which the terminal determines the specified region in the second webpage may be: the terminal obtains the application identifier of the source application of the first webpage, determines the specified region corresponding to that application identifier, and locates that specified region in the second webpage.
The fourth way: step (2) may be: the terminal determines the weight of each element node of the second webpage, determines a first specified element node according to the weight of each element node, determines the first node content corresponding to the first specified element node from the web page contents of the second webpage, and takes the first node content as the body matter.
In the embodiments of the present invention, the terminal may determine the weight of an element node by combining the label and the node content of the element node. Accordingly, the step in which the terminal determines the weight of each element node of the second webpage may be realized by the following steps (2-1) to (2-4):
(2-1) The terminal determines the label type of each element node and the number of words included in the node content corresponding to each element node.
(2-2) terminal determines the first weight of each node element according to the tag types of each node element.
The terminal stores the correspondence between each label type and a weight. Accordingly, this step may be: the terminal obtains the first weight of each element node from the correspondence between label types and weights according to the label type of that element node.
(2-3) The terminal determines the second weight of each element node according to the number of words included in the node content corresponding to that element node.
The terminal stores the correspondence between numbers of words and weights. Accordingly, this step may be: the terminal obtains the second weight of each element node from the correspondence between numbers of words and weights according to the number of words included in the node content corresponding to that element node.
In the embodiments of the present invention, because the terminal stores the correspondence between numbers of words and weights and looks up the second weight of each element node by the exact number of words in its node content, the accuracy of the determined second weight is improved.
The terminal may instead store the correspondence between word-count ranges and weights. Accordingly, this step may be: according to the number of words included in the node content corresponding to each element node and the stored word-count ranges, the terminal determines the word-count range in which that number of words falls, and then obtains the second weight of the element node from the correspondence between word-count ranges and weights.
In the embodiments of the present invention, when the terminal stores the correspondence between word-count ranges and weights, it determines the second weight of each element node from the number of words in the corresponding node content and that correspondence. The terminal then does not need to store a weight for every individual number of words, which saves storage space.
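The range-based lookup can be sketched with a sorted list of range boundaries and a binary search. The three ranges and their weights below are invented for illustration; the embodiment does not give concrete ranges or weights.

```python
from bisect import bisect_right

# Hypothetical word-count ranges and weights:
#   [0, 50)   -> 0.1
#   [50, 200) -> 0.5
#   [200, ..) -> 1.0
RANGE_STARTS = [0, 50, 200]
RANGE_WEIGHTS = [0.1, 0.5, 1.0]


def second_weight(word_count):
    """Return the weight of the range that contains word_count."""
    if word_count < 0:
        raise ValueError("word_count must be non-negative")
    # bisect_right finds how many range starts are <= word_count;
    # subtracting one gives the index of the containing range.
    index = bisect_right(RANGE_STARTS, word_count) - 1
    return RANGE_WEIGHTS[index]
```

Only one entry per range is stored, rather than one entry per possible number of words, which reflects the storage saving described above.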
(2-4) terminal determines the weight of each node element according to the first weight and the second weight of each node element.
For each element node, the terminal determines a first coefficient for the first weight and a second coefficient for the second weight, multiplies the first weight of the element node by the first coefficient to obtain a first value, multiplies the second weight of the element node by the second coefficient to obtain a second value, and takes the sum of the first value and the second value as the weight of the element node.
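Written out, the weight of an element node is first weight × first coefficient + second weight × second coefficient. The sketch below assumes a hypothetical label-type table and hypothetical coefficients of 0.4 and 0.6; none of these values come from the embodiment.

```python
# Hypothetical correspondence between label types and first weights.
LABEL_TYPE_WEIGHTS = {"p": 1.0, "div": 0.5, "a": 0.1}

# Hypothetical coefficients for the first and second weights.
FIRST_COEFFICIENT = 0.4
SECOND_COEFFICIENT = 0.6


def node_weight(label_type, second_weight):
    """Combine the label-type weight and the word-count weight of one node."""
    first_weight = LABEL_TYPE_WEIGHTS.get(label_type, 0.0)
    first_value = first_weight * FIRST_COEFFICIENT
    second_value = second_weight * SECOND_COEFFICIENT
    return first_value + second_value


paragraph_weight = node_weight("p", 0.5)
link_weight = node_weight("a", 0.1)
```

With these assumed values a paragraph node outweighs a link node, so a selection rule based on the weights would favor the paragraph; the text does not state which rule is used to pick the first specified element node.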
(3) The terminal displays the body matter of the second webpage in accordance with a second preset format.
The second preset format may be the same as or different from the first preset format. The second preset format includes a second non-textual content display format and a second text content display format. The second text content display format includes a second paragraph format and/or a second font format. The second non-textual content display format includes a second quotation display format, a second table display format and/or a second code block display format.
For example, referring to figure C in Fig. 2C, the terminal filters out the non-body matter in figure B in Fig. 2C and displays only the body matter of the web page contents of the second webpage.
In the embodiments of the present invention, the terminal identifies the body matter from the web page contents of the second webpage and thereby filters out non-body matter such as advertisements and/or recommendations. This prevents non-body matter from disturbing the user and improves user stickiness. In addition, because the typesetting of the web page contents is performed by the terminal, the concurrency load on the server is reduced.
After displaying the second webpage, the terminal stores the web page address of the first webpage together with the typeset web page contents of the second webpage. When the terminal subsequently displays the second webpage again, it obtains the web page contents of the second webpage directly from the stored correspondence according to the web page address of the first webpage and displays them, without repeating the typesetting process described above, which improves typesetting efficiency.
In the method for displaying web page contents provided by the embodiments of the present invention, the terminal obtains the web page element of the first webpage to be displayed and, according to the label of the web page element, determines the non-textual content and the text content of the first webpage from the web page contents of the first webpage. Because both the text content and the non-textual content can be determined from the web page contents of the first webpage, the second webpage can display not only the text content of the first webpage but also its non-textual content. This avoids the large discrepancy in context that typesetting suffers when non-textual content is filtered out, and improves accuracy.
An embodiment of the present invention provides a method for displaying collected web page contents. The method is executed by a terminal. Referring to Fig. 3, the method includes:
301, the terminal displays the collection entry of at least one collected webpage, the collection entry of any webpage including the web page address of that webpage.
The main interface of the web page contents currently displayed by the terminal includes a view button. When the user wants to view the web page contents of a webpage, the user can trigger a view instruction to the terminal by triggering the view button. When the terminal detects that the view button is triggered, it displays, in response to the view instruction, the collection entry of at least one webpage. The collection entry of each webpage includes at least the web page address of that webpage, and may also include information such as summary information of the webpage, the web page title and/or the application identifier of the source application of the webpage. Based on the displayed collection entries, the user can select the web page address of the webpage to be displayed from the web page addresses of the at least one webpage and submit the selected web page address to the terminal.
302, the terminal obtains the web page element of the first webpage according to the web page address of the selected first webpage.
This step is the same as the process of obtaining the web page element of the first webpage in step 201 and is not described again here.
303, the terminal determines the non-textual content and the text content of the first webpage from the web page contents of the first webpage according to the label of the web page element.
This step can be realized by steps 202-205 above, which are not described again here.
304, the terminal displays a second webpage that meets a preset format, the second webpage including the non-textual content and the text content of the first webpage.
This step is the same as step 206 and is not described again here.
In the method for displaying collected web page contents provided by the embodiments of the present invention, the terminal displays the collection entry of at least one collected webpage, and the collection entry of any webpage includes the web page address of that webpage. When the user wants to read the web page contents of a webpage, the user can click its web page address. The terminal obtains the web page element of the first webpage according to the web page address of the selected first webpage and, according to the label of the web page element, determines the non-textual content and the text content of the first webpage from the web page contents of the first webpage. Because both the text content and the non-textual content can be determined from the web page contents of the first webpage, the second webpage can display not only the text content of the first webpage but also its non-textual content. This avoids the large discrepancy in context that typesetting suffers when non-textual content is filtered out, and improves accuracy.
An embodiment of the present invention provides a device for displaying web page contents. The device is applied in a terminal and is used to perform the steps performed by the terminal in the above method for displaying web page contents. Referring to Fig. 4A, the device includes:
an obtaining module 401, for obtaining the web page element of the first webpage to be displayed;
a determining module 402, for determining the non-textual content and the text content of the first webpage from the web page contents of the first webpage according to the label of the web page element;
a display module 403, for typesetting the non-textual content and the text content and displaying the second webpage.
In a possible implementation, referring to Fig. 4B, the determining module 402 includes:
a construction unit 4021, for constructing the topological structure of the first webpage according to the web page element, each element node of the topological structure corresponding to one web page element;
a determination unit 4022, for determining a first element node in the topological structure, the first element node being an element node of non-textual content;
the determination unit 4022 being further used for determining a second element node in the topological structure, the second element node being an element node of text content;
an acquiring unit 4023, for obtaining the non-textual content from the node label of the first element node and obtaining the text content from the node label of the second element node.
In a possible implementation, the determination unit 4022 is further used for determining, in the topological structure, the element node corresponding to a first-class label and/or the element node corresponding to a second-class label, where the first-class label includes a quotation label, a table label and/or a code block label, and the second-class label includes a custom label; and for taking the element node corresponding to the first-class label and/or the element node corresponding to the second-class label as the first element node.
In a possible implementation, the determination unit 4022 is further used for obtaining the application identifier of the source application of the first webpage and determining, according to the application identifier, the first-class label and/or second-class label corresponding to that application identifier.
In a possible implementation, referring to Fig. 4C, the device further includes:
a removing module 404, for determining the non-content web page elements among the web page elements and deleting them.
In a possible implementation, referring to Fig. 4D, the display module 403 includes:
a composing unit 4031, for composing the non-textual content and the text content into the web page contents of the second webpage;
a recognition unit 4032, for identifying the body matter from the web page contents of the second webpage;
a display unit 4033, for displaying the second webpage that meets the preset format, the second webpage including the non-textual content and the text content of the first webpage.
In a possible implementation, the recognition unit 4032 is further used for identifying the body matter from the web page contents of the second webpage by a preset regular expression, the preset regular expression being used to identify body matter in web page contents; and/or
the recognition unit 4032 is further used for determining the weight of each element node of the second webpage, determining a first specified element node according to the weight of each element node, determining the first node content corresponding to the first specified element node from the web page contents of the second webpage, and taking the first node content as the body matter.
In a possible implementation, the recognition unit 4032 is further used for determining the label type of each element node and the number of words included in the node content corresponding to each element node; determining the first weight of each element node according to its label type; determining the second weight of each element node according to the number of words included in its corresponding node content; and determining the weight of each element node according to its first weight and second weight.
In a possible implementation, the recognition unit 4032 is further used for identifying a second specified element node from the element nodes of the second webpage by the preset regular expression, determining the second node content corresponding to the second specified element node from the web page contents of the second webpage, and taking the second node content as the body matter.
In a possible implementation, the obtaining module 401 is further used for displaying, in response to a view instruction, the web page address of at least one collected webpage; obtaining the web page address of the selected first webpage from the web page addresses of the at least one webpage; and obtaining the web page element of the first webpage according to the web page address of the first webpage.
In the embodiments of the present invention, the web page element of the first webpage to be displayed is obtained and, according to the label of the web page element, the non-textual content and the text content of the first webpage are determined from the web page contents of the first webpage. Because both the text content and the non-textual content can be determined from the web page contents of the first webpage, the second webpage can display not only the text content of the first webpage but also its non-textual content. This avoids the large discrepancy in context that typesetting suffers when non-textual content is filtered out, and improves accuracy.
An embodiment of the present invention provides a device for displaying collected web page contents. Referring to Fig. 5, the device includes:
a display module 501, for displaying the collection entry of at least one collected webpage, the collection entry of any webpage including the web page address of that webpage;
an obtaining module 502, for obtaining the web page element of the first webpage according to the web page address of the selected first webpage;
a determining module 503, for determining the non-textual content and the text content of the first webpage from the web page contents of the first webpage according to the label of the web page element;
the display module 501 being further used for displaying a second webpage that meets a preset format, the second webpage including the non-textual content and the text content of the first webpage.
In the embodiments of the present invention, the terminal displays the collection entry of at least one collected webpage, and the collection entry of any webpage includes the web page address of that webpage. When the user wants to read the web page contents of a webpage, the user can click its web page address. The terminal obtains the web page element of the first webpage according to the web page address of the selected first webpage and, according to the label of the web page element, determines the non-textual content and the text content of the first webpage from the web page contents of the first webpage. Because both the text content and the non-textual content can be determined from the web page contents of the first webpage, the second webpage can display not only the text content of the first webpage but also its non-textual content. This avoids the large discrepancy in context that typesetting suffers when non-textual content is filtered out, and improves accuracy.
It should be understood that, when the device for displaying web page contents provided by the above embodiments displays web page contents, the division into the functional modules described above is only an example. In practical applications, the functions may be assigned to different functional modules as needed; that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above. In addition, the device for displaying web page contents provided by the above embodiments and the method embodiments for displaying web page contents belong to the same concept; for the specific implementation process, see the method embodiments, which are not repeated here.
Fig. 6 shows a structural block diagram of a terminal 600 provided by an illustrative embodiment of the present invention. The terminal 600 may be a smartphone, a tablet computer, an MP3 player (Moving Picture Experts Group Audio Layer III), an MP4 player (Moving Picture Experts Group Audio Layer IV), a laptop computer or a desktop computer. The terminal 600 may also be referred to by other names such as user equipment, portable terminal, laptop terminal or desktop terminal.
In general, terminal 600 includes: processor 601 and memory 602.
The processor 601 may include one or more processing cores, for example a 4-core processor or an 8-core processor. The processor 601 may be implemented in at least one hardware form among DSP (Digital Signal Processing), FPGA (Field-Programmable Gate Array) and PLA (Programmable Logic Array). The processor 601 may also include a main processor and a coprocessor: the main processor, also called the CPU (Central Processing Unit), processes data in the awake state; the coprocessor is a low-power processor that processes data in the standby state. In some embodiments, the processor 601 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content to be displayed on the display screen. In some embodiments, the processor 601 may also include an AI (Artificial Intelligence) processor, which handles computing operations related to machine learning.
The memory 602 may include one or more computer-readable storage media, which may be non-transitory. The memory 602 may also include high-speed random access memory and non-volatile memory, such as one or more magnetic disk storage devices or flash memory devices. In some embodiments, the non-transitory computer-readable storage medium in the memory 602 stores at least one instruction, and the at least one instruction is executed by the processor 601 to implement the method for displaying web page contents provided by the method embodiments of this application.
In some embodiments, terminal 600 is also optional includes: peripheral device interface 603 and at least one peripheral equipment. It can be connected by bus or signal wire between processor 601, memory 602 and peripheral device interface 603.Each peripheral equipment It can be connected by bus, signal wire or circuit board with peripheral device interface 603.Specifically, peripheral equipment includes: radio circuit 604, at least one of touch display screen 605, camera 606, voicefrequency circuit 607, positioning component 608 and power supply 609.
Peripheral device interface 603 can be used for I/O (Input/Output, input/output) is relevant outside at least one Peripheral equipment is connected to processor 601 and memory 602.In some embodiments, processor 601, memory 602 and peripheral equipment Interface 603 is integrated on same chip or circuit board;In some other embodiments, processor 601, memory 602 and outer Any one or two in peripheral equipment interface 603 can realize on individual chip or circuit board, the present embodiment to this not It is limited.
Radio frequency circuit 604 is used to receive and transmit RF (Radio Frequency) signals, also called electromagnetic signals. Radio frequency circuit 604 communicates with communication networks and other communication devices through electromagnetic signals. It converts electrical signals into electromagnetic signals for transmission, or converts received electromagnetic signals into electrical signals. Optionally, radio frequency circuit 604 includes an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so on. Radio frequency circuit 604 can communicate with other terminals through at least one wireless communication protocol. The wireless communication protocol includes but is not limited to: the World Wide Web, metropolitan area networks, intranets, mobile communication networks of each generation (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, radio frequency circuit 604 may also include circuits related to NFC (Near Field Communication), which is not limited in this application.
Display screen 605 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. When display screen 605 is a touch display screen, it can also collect touch signals on or above its surface. The touch signal may be input to processor 601 as a control signal for processing. In this case, display screen 605 may also be used to provide virtual buttons and/or a virtual keyboard, also called soft buttons and/or a soft keyboard. In some embodiments, there may be one display screen 605, arranged on the front panel of terminal 600; in other embodiments, there may be at least two display screens 605, arranged on different surfaces of terminal 600 or in a folding design; in still other embodiments, display screen 605 may be a flexible display screen arranged on a curved surface or a folding surface of terminal 600. Display screen 605 may even be set to a non-rectangular irregular shape, that is, a special-shaped screen. Display screen 605 may be made of materials such as LCD (Liquid Crystal Display) or OLED (Organic Light-Emitting Diode).
Camera assembly 606 is used to capture images or video. Optionally, camera assembly 606 includes a front camera and a rear camera. Generally, the front camera is arranged on the front panel of the terminal and the rear camera on the back of the terminal. In some embodiments, there are at least two rear cameras, each being one of a main camera, a depth-of-field camera, a wide-angle camera, and a telephoto camera, so that the main camera and the depth-of-field camera can be fused to achieve a background blur function, and the main camera and the wide-angle camera can be fused to achieve panoramic shooting and VR (Virtual Reality) shooting or other fused shooting functions. In some embodiments, camera assembly 606 may also include a flash. The flash may be a single color temperature flash or a dual color temperature flash. A dual color temperature flash is a combination of a warm-light flash and a cold-light flash and can be used for light compensation at different color temperatures.
Audio circuit 607 may include a microphone and a speaker. The microphone collects sound waves from the user and the environment, converts them into electrical signals, and inputs them to processor 601 for processing, or to radio frequency circuit 604 for voice communication. For stereo capture or noise reduction, there may be multiple microphones, arranged at different parts of terminal 600. The microphone may also be an array microphone or an omnidirectional microphone. The speaker converts electrical signals from processor 601 or radio frequency circuit 604 into sound waves. The speaker may be a conventional thin-film speaker or a piezoelectric ceramic speaker. A piezoelectric ceramic speaker can convert electrical signals not only into sound waves audible to humans but also into sound waves inaudible to humans, for purposes such as ranging. In some embodiments, audio circuit 607 may also include a headphone jack.
Positioning component 608 is used to locate the current geographic position of terminal 600 to implement navigation or LBS (Location Based Service). Positioning component 608 may be a positioning component based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, the GLONASS system of Russia, or the Galileo system of the European Union.
Power supply 609 is used to supply power to the components in terminal 600. Power supply 609 may be alternating current, direct current, a disposable battery, or a rechargeable battery. When power supply 609 includes a rechargeable battery, the rechargeable battery may be a wired rechargeable battery or a wireless rechargeable battery. A wired rechargeable battery is charged through a wired line, and a wireless rechargeable battery is charged through a wireless coil. The rechargeable battery may also support fast-charging technology.
In some embodiments, terminal 600 further includes one or more sensors 610. The one or more sensors 610 include but are not limited to: acceleration sensor 611, gyro sensor 612, pressure sensor 613, fingerprint sensor 614, optical sensor 615, and proximity sensor 616.
Acceleration sensor 611 can detect the magnitude of acceleration on the three coordinate axes of a coordinate system established with terminal 600. For example, acceleration sensor 611 can be used to detect the components of gravitational acceleration on the three coordinate axes. Processor 601 can control touch display screen 605 to display the user interface in landscape view or portrait view according to the gravitational acceleration signal collected by acceleration sensor 611. Acceleration sensor 611 can also be used to collect game or user motion data.
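As a non-limiting illustration of the landscape/portrait decision described above, the following Python sketch compares the gravity components on two axes. The function name choose_orientation and the axis convention (x across the screen, y along its long edge) are assumptions made for illustration and are not specified by this embodiment.

```python
def choose_orientation(gx: float, gy: float) -> str:
    """Pick a view from the gravity components (m/s^2) on the device x and y axes.

    Hypothetical helper: x is assumed to run across the screen and y along
    its long edge; a real terminal may use a different convention.
    """
    # Gravity mostly along the long edge means the terminal is held upright.
    return "portrait" if abs(gy) >= abs(gx) else "landscape"
```

A production implementation would typically also filter the signal and apply hysteresis so that the view does not flip when the two components are nearly equal.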
Gyro sensor 612 can detect the body orientation and rotation angle of terminal 600, and can cooperate with acceleration sensor 611 to collect the user's 3D actions on terminal 600. Based on the data collected by gyro sensor 612, processor 601 can implement the following functions: motion sensing (for example, changing the UI according to the user's tilt operation), image stabilization during shooting, game control, and inertial navigation.
Pressure sensor 613 may be arranged on the side frame of terminal 600 and/or the lower layer of touch display screen 605. When pressure sensor 613 is arranged on the side frame of terminal 600, it can detect the user's grip signal on terminal 600, and processor 601 performs left/right hand recognition or shortcut operations according to the grip signal collected by pressure sensor 613. When pressure sensor 613 is arranged on the lower layer of touch display screen 605, processor 601 controls the operable controls on the UI according to the user's pressure operation on touch display screen 605. The operable controls include at least one of a button control, a scroll bar control, an icon control, and a menu control.
Fingerprint sensor 614 is used to collect the user's fingerprint. Processor 601 identifies the user according to the fingerprint collected by fingerprint sensor 614, or fingerprint sensor 614 identifies the user according to the collected fingerprint. When the user's identity is recognized as trusted, processor 601 authorizes the user to perform relevant sensitive operations, including unlocking the screen, viewing encrypted information, downloading software, making payments, and changing settings. Fingerprint sensor 614 may be arranged on the front, back, or side of terminal 600. When terminal 600 has a physical button or a manufacturer logo, fingerprint sensor 614 may be integrated with the physical button or the manufacturer logo.
Optical sensor 615 is used to collect ambient light intensity. In one embodiment, processor 601 can control the display brightness of touch display screen 605 according to the ambient light intensity collected by optical sensor 615. Specifically, when the ambient light intensity is high, the display brightness of touch display screen 605 is turned up; when the ambient light intensity is low, the display brightness of touch display screen 605 is turned down. In another embodiment, processor 601 can also dynamically adjust the shooting parameters of camera assembly 606 according to the ambient light intensity collected by optical sensor 615.
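The brightness adjustment described above can be sketched, under stated assumptions, as a clamped linear mapping from ambient light to a brightness level. The function name display_brightness, the lux unit, the 1000 lux full-scale value, and the 0.1 to 1.0 brightness range are all hypothetical; this embodiment only requires that brightness rise with ambient light and fall as it decreases.

```python
def display_brightness(ambient_lux: float,
                       min_level: float = 0.1,
                       max_level: float = 1.0,
                       full_scale_lux: float = 1000.0) -> float:
    """Map ambient light intensity to a display brightness level.

    All parameter values are illustrative assumptions.
    """
    # Clamp the reading to [0, 1] so that very bright light saturates.
    fraction = max(0.0, min(ambient_lux / full_scale_lux, 1.0))
    # Brighter surroundings give a brighter screen, and vice versa.
    return min_level + (max_level - min_level) * fraction
```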
Proximity sensor 616, also called a distance sensor, is generally arranged on the front panel of terminal 600. Proximity sensor 616 is used to collect the distance between the user and the front of terminal 600. In one embodiment, when proximity sensor 616 detects that the distance between the user and the front of terminal 600 gradually decreases, processor 601 controls touch display screen 605 to switch from the screen-on state to the screen-off state; when proximity sensor 616 detects that the distance between the user and the front of terminal 600 gradually increases, processor 601 controls touch display screen 605 to switch from the screen-off state to the screen-on state.
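The screen-state switching described above can be illustrated by the following minimal Python sketch, which compares two consecutive distance readings. The function name next_screen_state and the state strings "on" and "off" are assumptions for illustration only.

```python
def next_screen_state(state: str, prev_distance: float, distance: float) -> str:
    """Return the new screen state ("on" or "off") from two distance readings.

    Hypothetical helper: a shrinking distance turns the screen off, a growing
    distance turns it back on, and an unchanged distance keeps the state.
    """
    if distance < prev_distance:
        return "off"
    if distance > prev_distance:
        return "on"
    return state
```

In practice a terminal would likely compare the distance against thresholds over several readings rather than reacting to every small change.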
Those skilled in the art will understand that the structure shown in Fig. 6 does not constitute a limitation on terminal 600, which may include more or fewer components than illustrated, combine certain components, or use a different component arrangement.
An embodiment of the present invention also provides a computer-readable storage medium applied to a terminal. The computer-readable storage medium stores at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by a processor to implement the operations performed by the terminal in the method for displaying web page contents of the above embodiments.
An embodiment of the present invention also provides a computer-readable storage medium applied to a terminal. The computer-readable storage medium stores at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by a processor to implement the operations performed by the terminal in the method for displaying collected web page contents of the above embodiments.
Those of ordinary skill in the art will understand that all or part of the steps of the above embodiments may be implemented by hardware, or by a program instructing relevant hardware. The program may be stored in a computer-readable storage medium, and the storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like.
The foregoing are merely preferred embodiments of the present invention and are not intended to limit the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims (15)

1. A method for displaying web page contents, characterized in that the method comprises:
obtaining web page elements of a first web page to be displayed;
determining, according to labels of the web page elements, non-textual content and text content of the first web page from the web page contents of the first web page;
displaying a second web page that conforms to a preset format, the second web page including the non-textual content and the text content of the first web page.
2. The method according to claim 1, characterized in that determining, according to the labels of the web page elements, the non-textual content and the text content of the first web page from the web page contents of the first web page comprises:
constructing a topological structure of the first web page according to the web page elements, each element node of the topological structure corresponding to one web page element;
determining a first element node in the topological structure, the first element node being an element node of non-textual content;
determining a second element node in the topological structure, the second element node being an element node of text content;
obtaining the non-textual content from the node label of the first element node, and obtaining the text content from the node label of the second element node.
3. The method according to claim 2, characterized in that determining the first element node in the topological structure comprises:
determining, in the topological structure, element nodes corresponding to first-class labels, and/or determining, in the topological structure, element nodes corresponding to second-class labels, the first-class labels including quote labels, table labels and/or code block labels, and the second-class labels including custom labels;
taking the element nodes corresponding to the first-class labels and/or the element nodes corresponding to the second-class labels as the first element node.
4. The method according to claim 3, characterized in that the method further comprises:
obtaining an application identifier of the source application of the first web page;
determining, according to the application identifier, the first-class labels and/or the second-class labels corresponding to the application identifier.
5. The method according to claim 1, characterized in that before determining, according to the labels of the web page elements, the non-textual content and the text content of the first web page from the web page contents of the first web page, the method further comprises:
determining non-content web page elements among the web page elements, and deleting the non-content web page elements from the web page elements.
6. The method according to claim 1, characterized in that displaying the second web page that conforms to the preset format, the second web page including the non-textual content and the text content of the first web page, comprises:
composing the non-textual content and the text content into the web page contents of the second web page;
identifying body content from the web page contents of the second web page;
displaying the body content of the second web page in conformity with the preset format.
7. The method according to claim 6, characterized in that identifying the body content from the web page contents of the second web page comprises:
identifying the body content from the web page contents of the second web page by means of a preset regular expression, the preset regular expression being used to identify body content in web page contents; and/or
determining a weight of each element node of the second web page, determining a first specified element node according to the weight of each element node, determining first node content corresponding to the first specified element node from the web page contents of the second web page, and taking the first node content as the body content.
8. The method according to claim 7, characterized in that determining the weight of each element node in the web page contents of the second web page comprises:
determining the label type of each element node and the number of words contained in the node content corresponding to each element node;
determining a first weight of each element node according to the label type of each element node;
determining a second weight of each element node according to the number of words contained in the node content corresponding to each element node;
determining the weight of each element node according to the first weight and the second weight of each element node.
9. The method according to claim 7, characterized in that identifying the body content from the web page contents of the second web page by means of the preset regular expression comprises:
identifying a second specified element node from the element nodes of the second web page by means of the preset regular expression;
determining second node content corresponding to the second specified element node from the web page contents of the second web page, and taking the second node content as the body content.
10. The method according to any one of claims 1 to 9, characterized in that obtaining the web page elements of the first web page to be displayed comprises:
in response to a viewing instruction, displaying, according to the viewing instruction, the web page address of at least one collected web page;
obtaining the web page address of a selected first web page from the web page address of the at least one web page;
obtaining the web page elements of the first web page from the web page address of the first web page.
11. A method for displaying collected web page contents, characterized in that the method comprises:
displaying a collection entry of at least one collected web page, the collection entry of any web page including the web page address of that web page;
obtaining web page elements of a selected first web page according to the web page address of the first web page;
determining, according to labels of the web page elements, non-textual content and text content of the first web page from the web page contents of the first web page;
displaying a second web page that conforms to a preset format, the second web page including the non-textual content and the text content of the first web page.
12. A device for displaying web page contents, characterized in that the device comprises:
an obtaining module, configured to obtain web page elements of a first web page to be displayed;
a determining module, configured to determine, according to labels of the web page elements, non-textual content and text content of the first web page from the web page contents of the first web page;
a display module, configured to display a second web page that conforms to a preset format, the second web page including the non-textual content and the text content of the first web page.
13. A device for displaying collected web page contents, characterized in that the device comprises:
a display module, configured to display a collection entry of at least one collected web page, the collection entry of any web page including the web page address of that web page;
an obtaining module, configured to obtain web page elements of a selected first web page according to the web page address of the first web page;
a determining module, configured to determine, according to labels of the web page elements, non-textual content and text content of the first web page from the web page contents of the first web page;
the display module being further configured to display a second web page that conforms to a preset format, the second web page including the non-textual content and the text content of the first web page.
14. A terminal, characterized in that the terminal comprises a processor and a memory, the memory storing at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by the processor to implement the operations performed in the method for displaying web page contents according to any one of claims 1 to 10, or the operations performed in the method for displaying collected web page contents according to claim 11.
15. A computer-readable storage medium, characterized in that the computer-readable storage medium stores at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by a processor to implement the operations performed in the method for displaying web page contents according to any one of claims 1 to 10, or the operations performed in the method for displaying collected web page contents according to claim 11.
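
As a non-limiting illustration of claims 2, 3, 5, 7 and 8, the following Python sketch builds an element tree from markup, discards non-content elements, separates non-textual element nodes from text element nodes by label, and scores a node by label type and amount of text. The concrete label sets (NON_TEXT_LABELS, NON_CONTENT_LABELS), the LABEL_WEIGHTS table, the division by 100, the use of character count as the text measure, and the summing of the two weights are assumptions made for illustration; the claims do not specify them.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser

# Assumed label sets: the claims name categories (quote, table, code block),
# not concrete HTML tags.
NON_TEXT_LABELS = {"blockquote", "table", "pre"}
NON_CONTENT_LABELS = {"script", "style", "nav"}          # dropped first (claim 5)
LABEL_WEIGHTS = {"article": 3.0, "p": 2.0, "div": 1.0}   # hypothetical first weights


@dataclass
class ElementNode:
    label: str
    text: str = ""
    children: list = field(default_factory=list)


class TreeBuilder(HTMLParser):
    """Builds the topological structure: one element node per web page element."""

    def __init__(self):
        super().__init__()
        self.root = ElementNode("root")
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = ElementNode(tag)
        self.stack[-1].children.append(node)
        self.stack.append(node)

    def handle_endtag(self, tag):
        # Assumes well-formed markup with properly paired tags.
        if len(self.stack) > 1 and self.stack[-1].label == tag:
            self.stack.pop()

    def handle_data(self, data):
        self.stack[-1].text += data.strip()


def full_text(node):
    """Concatenate the text of a node and all of its descendants."""
    return node.text + "".join(full_text(child) for child in node.children)


def classify(node, non_text, text):
    """Split descendants into non-textual and text element nodes by label."""
    for child in node.children:
        if child.label in NON_CONTENT_LABELS:
            continue                    # non-content element: deleted
        if child.label in NON_TEXT_LABELS:
            non_text.append(child)      # first element node (non-textual content)
            continue
        if child.text:
            text.append(child)          # second element node (text content)
        classify(child, non_text, text)


def node_weight(node):
    """Combine a label-type weight with a text-length weight (assumed: sum)."""
    first = LABEL_WEIGHTS.get(node.label, 0.5)
    second = len(full_text(node)) / 100.0
    return first + second
```

For the input `<div><script>x()</script><p>Hello world</p><pre>print(1)</pre></div>`, the script element is discarded, the p element is collected as a text element node, and the pre element is collected as a non-textual element node; the node with the highest node_weight would then be a candidate for the first specified element node of claim 7.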
CN201711202503.7A 2017-11-27 2017-11-27 Method, device, terminal and storage medium for displaying webpage content Active CN109948095B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711202503.7A CN109948095B (en) 2017-11-27 2017-11-27 Method, device, terminal and storage medium for displaying webpage content

Publications (2)

Publication Number Publication Date
CN109948095A true CN109948095A (en) 2019-06-28
CN109948095B CN109948095B (en) 2022-09-30

Family

ID=67003973

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711202503.7A Active CN109948095B (en) 2017-11-27 2017-11-27 Method, device, terminal and storage medium for displaying webpage content

Country Status (1)

Country Link
CN (1) CN109948095B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114020987A (en) * 2022-01-06 2022-02-08 北京微步在线科技有限公司 Sample data acquisition method, device, equipment and storage medium based on webpage

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101477564A (en) * 2009-01-21 2009-07-08 北京千家悦网络科技有限公司 Intelligent layout method for displaying wide web page on narrow-screen equipment
CN103150389A (en) * 2013-03-21 2013-06-12 北京奇虎科技有限公司 Method and device for processing matching setting of webpage text contents
CN103345532A (en) * 2013-07-26 2013-10-09 人民搜索网络股份公司 Method and device for extracting webpage information
US20170052994A1 (en) * 2015-08-18 2017-02-23 Samsung Electronics Co., Ltd. Method and system for bookmarking a webpage
CN106095985A (en) * 2016-06-20 2016-11-09 网际傲游(北京)科技有限公司 A kind of dynamic collection the method for cluster web pages information
CN107329985A (en) * 2017-05-31 2017-11-07 北京安云世纪科技有限公司 A kind of collecting method of the page, device and mobile terminal

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
KAMPS J: "Language Models for Searching in Web Corpora", 《THIRTEENTH TEXT RETRIEVAL CONFERENCE》 *
孙莉娜: "基于超链接信息的Web文本聚类方法研究", 《电脑知识与技术(学术交流)》 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant