CN109948095A - Method, apparatus, terminal, and storage medium for displaying web page content
- Publication number
- CN109948095A CN201711202503.7A
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Abstract
The invention discloses a method, an apparatus, and a storage medium for displaying web page content, and belongs to the field of network technology. The method includes: obtaining the web page elements of a first web page to be displayed; determining, according to the tags of the web page elements, the non-text content and the text content of the first web page from the web page content of the first web page; and displaying a second web page that conforms to a preset format, the second web page including the non-text content and the text content of the first web page. Because both the text content and the non-text content can be determined from the web page content of the first web page, the second web page can show not only the text content of the first web page but also its non-text content. This avoids the large difference between the content before and after typesetting that arises when non-text content is filtered out, and improves accuracy.
Description
Technical field
The present invention relates to the field of network technology, and in particular to a method, an apparatus, a terminal, and a storage medium for displaying web page content.
Background
When a user reads information-type web page content, many application software programs (apps) provide a collection function so that the user can conveniently consult the web page content again later. The collection function of an application can collect not only web page content within that application but also web page content in other applications. Because the typesetting style of web pages differs from one application to another, in order to achieve a unified style, when the user views the web page content of a web page in the application, the terminal typesets the web page content again and displays the typeset web page content.
Currently, the web page content is typeset by the server of the application. The process may be as follows: when the user reads the web page content of a collected web page, the terminal sends the web page address of the web page to the server; the server pulls the web page content of the web page according to the web page address, extracts the text content of the web page content, typesets the extracted text content according to a preset format, and sends the typeset text content to the terminal. The terminal receives and displays the typeset text content.
In the course of implementing the present invention, the inventors found that the prior art has at least the following problem:
In the above method, the server can extract only the text content of the web page content and typesets only that text content. Because only the text content is extracted and the non-text content is filtered out, the web page content after typesetting differs considerably from the web page content before typesetting; that is, accuracy is poor.
Summary of the invention
The present invention provides a method, an apparatus, a terminal, and a storage medium for displaying web page content, which can solve the problem of poor display accuracy. The technical solutions are as follows:
In one aspect, the present invention provides a method for displaying web page content, the method comprising:
obtaining the web page elements of a first web page to be displayed;
determining, according to the tags of the web page elements, the non-text content and the text content of the first web page from the web page content of the first web page; and
displaying a second web page that conforms to a preset format, the second web page including the non-text content and the text content of the first web page.
In another aspect, the present invention provides a method for displaying collected web page content, the method comprising:
displaying the collection entry of at least one collected web page, the collection entry of any web page including the web page address of that web page;
obtaining the web page elements of a selected first web page according to the web page address of the first web page;
determining, according to the tags of the web page elements, the non-text content and the text content of the first web page from the web page content of the first web page; and
displaying a second web page that conforms to a preset format, the second web page including the non-text content and the text content of the first web page.
In another aspect, the present invention provides an apparatus for displaying web page content, the apparatus comprising:
an obtaining module, configured to obtain the web page elements of a first web page to be displayed;
a determining module, configured to determine, according to the tags of the web page elements, the non-text content and the text content of the first web page from the web page content of the first web page; and
a display module, configured to display a second web page that conforms to a preset format, the second web page including the non-text content and the text content of the first web page.
In another aspect, the present invention provides an apparatus for displaying collected web page content, the apparatus comprising:
a display module, configured to display the collection entry of at least one collected web page, the collection entry of any web page including the web page address of that web page;
an obtaining module, configured to obtain the web page elements of a selected first web page according to the web page address of the first web page; and
a determining module, configured to determine, according to the tags of the web page elements, the non-text content and the text content of the first web page from the web page content of the first web page;
the display module being further configured to display a second web page that conforms to a preset format, the second web page including the non-text content and the text content of the first web page.
In another aspect, the present invention provides a terminal including a processor and a memory. The memory stores at least one instruction, at least one program, a code set, or an instruction set, and the instruction, the program, the code set, or the instruction set is loaded and executed by the processor to implement the operations performed in the method for displaying web page content described above.
In another aspect, the present invention provides a terminal including a processor and a memory. The memory stores at least one instruction, at least one program, a code set, or an instruction set, and the instruction, the program, the code set, or the instruction set is loaded and executed by the processor to implement the operations performed in the method for displaying collected web page content described above.
In another aspect, the present invention provides a computer-readable storage medium. The computer-readable storage medium stores at least one instruction, at least one program, a code set, or an instruction set, and the instruction, the program, the code set, or the instruction set is loaded and executed by a processor to implement the operations performed in the method for displaying web page content described above.
In another aspect, the present invention provides a computer-readable storage medium. The computer-readable storage medium stores at least one instruction, at least one program, a code set, or an instruction set, and the instruction, the program, the code set, or the instruction set is loaded and executed by a processor to implement the operations performed in the method for displaying collected web page content described above.
In the method for displaying web page content provided by the embodiments of the present invention, the web page elements of the first web page to be displayed are obtained, and the non-text content and the text content of the first web page are determined from the web page content of the first web page according to the tags of the web page elements. Because both the text content and the non-text content can be determined from the web page content of the first web page, the second web page can show not only the text content of the first web page but also its non-text content. This avoids the large difference between the content before and after typesetting that arises when non-text content is filtered out, and improves accuracy.
Brief description of the drawings
Fig. 1A is a schematic diagram of an implementation environment provided by an embodiment of the present invention;
Fig. 1B is a schematic diagram of displaying sharing interfaces provided by an embodiment of the present invention;
Fig. 1C is a schematic diagram of collecting the web page address of a web page provided by an embodiment of the present invention;
Fig. 1D is a schematic diagram of displaying prompt information provided by an embodiment of the present invention;
Fig. 2A is a flowchart of a method for displaying web page content provided by an embodiment of the present invention;
Fig. 2B is a schematic diagram of displaying a collection interface provided by an embodiment of the present invention;
Fig. 2C is a schematic diagram of displaying the web page content of the first web page provided by an embodiment of the present invention;
Fig. 3 is a flowchart of a method for displaying collected web page content provided by an embodiment of the present invention;
Fig. 4A is a schematic structural diagram of an apparatus for displaying web page content provided by an embodiment of the present invention;
Fig. 4B is a schematic structural diagram of a determining module provided by an embodiment of the present invention;
Fig. 4C is a schematic structural diagram of an apparatus for displaying web page content provided by an embodiment of the present invention;
Fig. 4D is a schematic structural diagram of a display module provided by an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of an apparatus for displaying collected web page content provided by an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of a terminal provided by an embodiment of the present invention.
Detailed description
To make the objectives, technical solutions, and advantages of the present invention clearer, the embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
An embodiment of the present invention provides an implementation environment. Referring to Fig. 1A, the implementation environment includes a terminal 101 and a resource server 102. The terminal 101 may be a mobile phone, a PAD (Portable Android Device, a tablet computer), a computer, or any other device on which an application with a collection function is installed. The terminal 101 can access the resource server 102 over a network to obtain the service provided by the resource server 102. The service may be a web content service, for example a news service or a public platform service.
The terminal 101 can access the resource server 102 through any application installed on the terminal 101. When accessing web page content provided by the resource server 102, the terminal 101 can collect the accessed content through any application that has a collection function, so that the content can be retrieved quickly from the collection when the user needs to consult it later. Collecting means storing the address and/or the content data of the web page content to be collected on the server corresponding to the application client.

The application with the collection function and the application used to access the web page content may be the same application. For example, when web page content is displayed in an application client, triggering the collection function of that application client stores the web page address and/or the content data of the web page content on the server corresponding to the application client, thereby completing the collection of the web page content.

The application with the collection function and the application used to access the web page content may also be different applications. For example, when the terminal 101 reads content such as news, web pages, or public platform articles in a first application client, triggering a shortcut function in the first application client, such as a collection function, can call or trigger an interface of another application client that has a collection function (the second application client). This establishes an information or data transmission connection between the two application clients (the first application client and the second application client), or sends a preset trigger signal to the second application client, or generates a preset signal that the second application client captures or receives. Correspondingly, the second application client can establish a communication connection with the first application client through the interface, or it can capture, search for, or receive the preset signal that the first application client triggers or generates through its shortcut function, and then collect the content, such as news, web pages, or public platform articles, displayed in the first application client. Alternatively, the first application client can pass the web page address and/or the content data of the web page content to the second application client, which stores them on the server corresponding to the second application client, thereby completing the collection of the web page content.

The embodiments of the present invention do not limit how the above collection function is specifically implemented or the specific form that the applications take. In addition, when the terminal completes the collection of the web page content, it displays a prompt message indicating that the collection succeeded.
When the second application client of the terminal 101 collects web page content displayed in the first application client, the terminal 101 runs the first application client in the foreground, and the first application client displays the web page content of a web page together with a share button. When the user wants to collect the web page content, the user can tap the share button. When the terminal 101 detects that the share button is triggered, it displays at least one sharing interface. The at least one sharing interface includes a collection interface that calls the second application client; the collection interface may be, for example, a copy-link or share-link option. When the user wants to collect the web page, the user can tap the collection interface. When the terminal 101 detects that the collection interface is triggered, it runs the second application client in the foreground and stores the web page address and/or the content data of the web page on the server corresponding to the second application client.
For example, referring to Fig. 1B, the user reads, on the terminal 101 (a mobile phone in this embodiment), a web page titled "Understand in three minutes: passing through the whole world of a product manager". When the user is interested in the content and wants to collect it, the user can trigger a sharing instruction to the terminal 101 by triggering a shortcut function. When the terminal 101 detects the sharing instruction triggered by the user, it displays at least one sharing interface. The at least one sharing interface includes a collection interface, which is used to collect the web page. For example, the at least one sharing interface includes: generate a picture to share, share with friends through a social application, copy link, and share to an information display platform. Here, copy link is the collection interface. When the user wants to collect the web page content, the user taps copy link. When the terminal 101 detects that the collection interface is triggered, it obtains the web page address and/or the content data of the web page and stores them on the server corresponding to the application client, thereby completing the collection of the web page content; see Fig. 1C. When the terminal 101 completes the collection of the web page content, it displays prompt information, which may be "Collected successfully"; see Fig. 1D.
When the terminal 101 stores the web page address and/or the content data of the web page content on the server corresponding to the application client, the terminal 101 generates a collection entry for the web page. The collection entry includes at least the web page address of the web page, and may also include information such as a summary of the web page, the web page title, and/or the application identifier of the source application of the web page. The application identifier may be, for example, the application name.
It should be noted that web page content collected through any application may be processed and displayed by the application client itself, or the application client may interact with the corresponding server so that the server processes the content and returns it to the application client for display. In addition, the server can save the collected web page content and related information on a per-user basis, so that after the same user logs in to the application client on any terminal 101, the user can use the collection function to browse the information collected on any terminal 101.
An embodiment of the present invention provides a method for displaying web page content. The method may be executed by a terminal. Referring to Fig. 2A, the method includes:
201. The terminal obtains the web page elements of the first web page to be displayed.
The first web page is a web page that the terminal has collected. The embodiments of the present invention are described using only the viewing of collected web page content as an example, and do not limit the source application of the collected web page. The terminal may store in advance either the web page address or the web page content of the collected web page. If the web page address is stored, the terminal can obtain the web page elements of the first web page according to the web page address, which avoids occupying too much storage space on the terminal. If the web page content is stored, the terminal can obtain the web page elements of the first web page from the pre-stored web page content.
In the embodiment in which the web page address is stored, the terminal obtains the web page elements of the first web page to be displayed through steps (1) to (3) below. The web page elements of the first web page may be HTML (HyperText Markup Language) elements.
(1) In response to a view instruction, the terminal displays, according to the view instruction, the web page address of at least one collected web page.
The main interface of the web page content currently displayed by the terminal includes a view button. When the user wants to view the web page content of a web page, the user can trigger a view instruction to the terminal by triggering the view button. When the terminal detects that the view button is triggered, it responds to the view instruction and, according to the view instruction, displays the web page address of at least one collected web page.
When displaying the web page address of at least one collected web page, the terminal can display the collection entry of the at least one web page. The collection entry of each web page includes at least the web page address of that web page, and may also include information such as a summary of the web page, the web page title, and/or the application identifier of the source application of the web page. Based on the displayed collection entries, the user can select the web page address of the web page to be displayed from the web page addresses of the at least one web page and submit the selected web page address to the terminal, after which step (2) is performed.
For example, referring to Fig. 2B, the terminal displays the web page addresses of two collected web pages: "Understand in three minutes: passing through the whole world of a product manager" and "HTML_Baidupedia".
(2) The terminal obtains the web page address of the selected first web page from the web page addresses of the multiple web pages.
For example, if the user selects "Understand in three minutes: passing through the whole world of a product manager" from the web page addresses of the two web pages that the terminal has collected, the terminal obtains the web page address of "Understand in three minutes: passing through the whole world of a product manager".
(3) The terminal obtains the web page elements of the first web page from the web page address of the first web page.
The terminal loads the web page address of the first web page through a WebView, thereby accessing the source server of the first web page and obtaining the web page elements of the first web page from its web page address. The web page elements of the first web page may be the source code of the first web page.
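The two storage options in step 201 (stored content versus stored address) can be sketched as a single lookup. This is a minimal sketch under stated assumptions: the function name `get_page_source` is hypothetical, and the `load` callable merely stands in for loading the address through a WebView, which the sketch does not attempt to reproduce.

```python
from typing import Callable, Optional


def get_page_source(address: Optional[str],
                    stored_content: Optional[str],
                    load: Callable[[str], str]) -> str:
    """Return the page source from stored content, else by loading the address."""
    if stored_content is not None:
        return stored_content      # the content was saved when the page was collected
    if address is None:
        raise ValueError("neither web page content nor a web page address was stored")
    return load(address)           # stand-in for loading the address through a WebView


# A fake loader keeps the sketch runnable without network access.
fake_pages = {"https://example.com/a": "<html><body><p>Hi</p></body></html>"}
source = get_page_source("https://example.com/a", None, fake_pages.__getitem__)
```

Passing the loader in as a parameter keeps the storage decision separate from the network access, so either branch can be exercised on its own.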
In the embodiments of the present invention, after obtaining the web page elements of the first web page, the terminal may directly perform steps 202 to 205 below: according to the tags of the web page elements, it determines the non-text content and the text content of the first web page from the web page content of the first web page, and then typesets the non-text content and the text content. Alternatively, the terminal may first delete the non-content elements from the web page elements of the first web page, and only then determine the non-text content and the text content of the first web page from its web page content.
The step in which the terminal deletes the non-content elements from the web page elements of the first web page may be as follows: the terminal determines the non-content web page elements among the web page elements and deletes them from the web page elements to obtain the content web page elements. Step 202 is then performed on the basis of the content web page elements.
The non-content web page elements may be the web page elements corresponding to third-class tags, where the third-class tags include stylesheet tags, style tags, and/or script tags. Accordingly, the step in which the terminal determines the non-content web page elements among the web page elements may be as follows: according to the tags of the web page elements, the terminal determines the web page elements corresponding to the third-class tags among the web page elements, and takes those web page elements as the non-content web page elements.
For example, the third-class tags include a stylesheet tag, a style tag, and a script tag. The stylesheet tag may be a CSS (Cascading Style Sheets) tag, the style tag may be the style tag, and the script tag may be the script tag. According to the tags of the web page elements, the terminal determines the web page elements corresponding to the CSS tag, the style tag, and the script tag among the web page elements, and these web page elements together form the non-content web page elements.
In the embodiments of the present invention, the terminal deletes the non-content web page elements from the web page elements of the first web page. As a result, when the non-text content and the text content are later determined from the web page content of the first web page, the result is not affected by these non-content web page elements (such as stylesheets, styles, and/or scripts), which improves the accuracy of typesetting.
202. The terminal constructs the topology of the first web page according to the web page elements of the first web page, where each element node in the topology corresponds to one web page element.
The terminal determines the hierarchical relationships among the web page elements of the first web page and constructs the topology of the first web page according to those hierarchical relationships. The topology may be a DOM (Document Object Model) tree. A DOM tree is the tree structure generated by parsing the web page elements of the first web page through the DOM. Each element node of the tree structure corresponds to one web page element, and the web page content of that web page element is stored in the node tag of the element node.
In a possible implementation, before constructing the topology of the first web page, the terminal may also leave the non-content elements in the web page elements of the first web page undeleted. Instead, when constructing the topology of the first web page according to its web page elements, the terminal ignores the non-content elements of the first web page and constructs the topology from the content elements only.
In the embodiments of the present invention, the terminal constructs the topology of the first web page so that subsequent processing can be carried out on the basis of the topology, which improves the efficiency of subsequent typesetting.
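Step 202 can be illustrated with a small tree builder based on Python's standard `html.parser`. This is a simplified sketch, not a full DOM implementation: the `ElementNode` record, the choice of ignored tags, and the list of void elements are assumptions of the sketch. It follows the variant described above in which non-content elements are ignored while the tree is being built.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser

IGNORED_TAGS = {"style", "script"}  # assumed non-content tags
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


@dataclass
class ElementNode:
    tag: str
    attrs: dict
    text: str = ""                      # text directly inside this element
    children: list = field(default_factory=list)


class TreeBuilder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.root = ElementNode("#document", {})
        self.stack = [self.root]        # currently open elements
        self.skipping = 0               # greater than 0 inside an ignored element

    def handle_starttag(self, tag, attrs):
        if tag in IGNORED_TAGS:
            self.skipping += 1
            return
        if self.skipping:
            return
        node = ElementNode(tag, dict(attrs))
        self.stack[-1].children.append(node)
        if tag not in VOID_TAGS:        # void elements never get children
            self.stack.append(node)

    def handle_endtag(self, tag):
        if tag in IGNORED_TAGS:
            self.skipping = max(0, self.skipping - 1)
            return
        if self.skipping:
            return
        # Close the nearest open element with this tag, if there is one.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        if not self.skipping:
            self.stack[-1].text += data


def build_tree(html):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


tree = build_tree("<div><h1>T</h1><p>a<br>b</p><script>x()</script></div>")
```

Each `ElementNode` keeps its own tag, attributes, and direct text, which corresponds loosely to an element node that stores the content of its web page element.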
203. The terminal determines the first element nodes in the topology, where a first element node is an element node of non-text content.
The web page element corresponding to an element node includes a tag, and the tags of text content differ from the tags of non-text content. The terminal can therefore determine the first element nodes in the topology from the tags of the web page elements. Accordingly, this step may be as follows: the terminal determines the element nodes corresponding to first-class tags in the topology, and/or determines the element nodes corresponding to second-class tags in the topology, where the first-class tags include quotation tags, table tags, and/or code block tags, and the second-class tags include custom tags; the terminal then takes the element nodes corresponding to the first-class tags and/or the element nodes corresponding to the second-class tags as the first element nodes.
The terminal traverses the web page element corresponding to each element node in the topology, determines the web page elements whose tag is a first-class tag, and takes the element nodes corresponding to the determined web page elements as the element nodes corresponding to the first-class tags. Likewise, the terminal traverses the web page element corresponding to each element node in the topology, determines the web page elements whose tag is a second-class tag, and takes the element nodes corresponding to the determined web page elements as the element nodes corresponding to the second-class tags.
For example, the first-class tags include a quotation tag, a table tag, and code block tags. The quotation tag is <blockquote>, the table tag is <table>, and the code block tags are <code>, <pre>, and so on. The terminal determines the element nodes in the topology whose tag is <blockquote>, <table>, <code>, or <pre>, and takes the determined element nodes as the first element nodes. The second-class tags include custom tags, for example an audio tag, a video tag, or a picture tag.
Because different applications are coded in different ways, different applications may correspond to different first-class tags and second-class tags. Before this step, the terminal needs to determine the first-class tags and/or the second-class tags. The process may be as follows: the terminal obtains the application identifier of the source application of the first web page and, according to the application identifier, determines the first-class tags and/or the second-class tags corresponding to the application identifier.
When collecting the first web page, the terminal stores the collection entry of the first web page, and the collection entry includes the application identifier of the source application of the first web page. The terminal therefore obtains the application identifier of the source application of the first web page directly from the collection entry of the first web page.
Different applications correspond to different first-class tags. Before this step, the terminal obtains the application identifiers of multiple source applications and the first-class tags of each application, and stores the correspondence between the application identifier of each source application and its first-class tags. Accordingly, the step in which the terminal determines the first-class tags corresponding to the application identifier may be as follows: according to the application identifier, the terminal obtains the first-class tags corresponding to the application identifier from the correspondence between application identifiers and first-class tags.
Likewise, different applications correspond to different second-class tags. Before this step, the terminal obtains the application identifiers of multiple source applications and the second-class tags of each source application, and stores the correspondence between the application identifier of each source application and its second-class tags. Accordingly, the step in which the terminal determines the second-class tags corresponding to the application identifier may be as follows: according to the application identifier, the terminal obtains the second-class tags corresponding to the application identifier from the correspondence between application identifiers and second-class tags.
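The stored correspondences between application identifiers and tag classes can be sketched as a plain lookup table. The application names, the tag sets, and the fallback to a default when an application is unknown are all assumptions made for illustration; the text does not say what happens for an unlisted application.

```python
from typing import NamedTuple


class TagClasses(NamedTuple):
    first_class: frozenset
    second_class: frozenset


# Assumed fallback for applications without a stored correspondence.
DEFAULT_TAGS = TagClasses(frozenset({"blockquote", "table", "code", "pre"}),
                          frozenset())

# Hypothetical correspondence table keyed by application identifier.
TAGS_BY_APP = {
    "NewsApp": TagClasses(frozenset({"blockquote", "table"}),
                          frozenset({"img", "video"})),
    "BlogApp": TagClasses(frozenset({"blockquote", "code", "pre"}),
                          frozenset({"audio"})),
}


def tags_for_app(app_id):
    """Look up the first-class and second-class tags for a source application."""
    return TAGS_BY_APP.get(app_id, DEFAULT_TAGS)


news_tags = tags_for_app("NewsApp")
unknown_tags = tags_for_app("SomeOtherApp")
```

Keeping both tag classes in one record per application means a single lookup by the identifier taken from the collection entry yields everything step 203 needs.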
In the embodiments of the present invention, after determining the first element nodes in the topology, the terminal stores the correspondence between the web page address of the first web page and the first element nodes. When the terminal later displays the web page content of the first web page again, it can obtain the first element nodes of the first web page directly from this correspondence according to the web page address of the first web page, without carrying out the above identification process again. This improves recognition efficiency and, in turn, the efficiency of subsequent typesetting.
204, the terminal determines a second element node in the topological structure; the second element node is an element node of text content.
The web page element corresponding to an element node includes a label, and the labels of text content differ from the labels of non-text content. The terminal can therefore determine the second element node in the topological structure from the labels of the web page elements. Accordingly, this step may be: the terminal determines the element node corresponding to a fourth-type label in the topological structure and uses that element node as the second element node. The fourth-type label is a text label.
For example, the fourth-type label includes a text label such as <class>. The terminal determines the element nodes whose label is <class> in the topological structure and uses the determined element nodes as second element nodes.
Because different applications are coded differently, different applications may correspond to different fourth-type labels. The terminal can also obtain the fourth-type label corresponding to the application identifier of the source application of the first web page, in the same way as it obtains the first-type label and/or the second-type label above.
The topological structure includes the first element node and the second element node. Therefore, once the terminal has determined the first element node, it can use the other element nodes in the topological structure, apart from the first element node, as second element nodes.
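The two ways of determining second element nodes described above (by text label, or as the complement of the first element nodes) can be sketched as follows. This is an illustration under stated assumptions: the ElementNode class, the function split_nodes and the concrete label names are hypothetical, and real HTML tag names are used only as stand-ins for the first-type and fourth-type labels.

```python
from dataclasses import dataclass


@dataclass
class ElementNode:
    tag: str      # label of the corresponding web page element
    content: str  # node content


# Assumed label sets, for illustration only.
NON_TEXT_TAGS = {"blockquote", "table", "pre"}  # stand-in for first-type labels
TEXT_TAGS = {"p", "h1"}                         # stand-in for fourth-type labels


def split_nodes(nodes, by_complement=False):
    """Return (first element nodes, second element nodes).

    by_complement=False: second nodes are those whose label is a text label.
    by_complement=True:  second nodes are all nodes that are not first nodes.
    """
    first = [n for n in nodes if n.tag in NON_TEXT_TAGS]
    if by_complement:
        second = [n for n in nodes if n.tag not in NON_TEXT_TAGS]
    else:
        second = [n for n in nodes if n.tag in TEXT_TAGS]
    return first, second
```

The complement variant keeps nodes with unrecognized labels (for example a generic container), whereas the label-based variant drops them, so the two variants may give different results on the same page.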
Likewise, after determining the second element node in the topological structure, the terminal stores the correspondence between the web page address of the first web page and the second element node. When the terminal later displays the web page content of the first web page again, it obtains the second element node of the first web page directly from this correspondence according to the web page address of the first web page, without repeating the identification process above. This improves identification efficiency and, in turn, the efficiency of subsequent typesetting.
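The stored correspondence between a web page address and its identified element nodes behaves like a cache keyed by address. The sketch below illustrates that idea; the class NodeCache and the identify callback are hypothetical names introduced here, not part of the embodiments.

```python
class NodeCache:
    """Maps a web page address to the element nodes identified for it."""

    def __init__(self):
        self._by_url = {}

    def get_or_identify(self, url, identify):
        # Run the (possibly expensive) identification only on the first visit;
        # later visits reuse the stored correspondence.
        if url not in self._by_url:
            self._by_url[url] = identify(url)
        return self._by_url[url]
```

The same pattern would apply to the first element nodes and to the second element nodes; whether and when stored entries expire is not addressed by the description.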
It should be noted that steps 203 and 204 need not be executed in a strict order: step 203 may be executed before step 204, step 204 may be executed before step 203, or the two steps may be executed simultaneously by two processes. This is not limited in the embodiments of the present invention.
205, the terminal obtains the non-text content from the node label of the first element node and obtains the text content from the node label of the second element node.
The node label of an element node includes web page content. The terminal obtains the non-text content from the node label of the first element node and the text content from the node label of the second element node, and then performs step 206 to typeset the non-text content and the text content.
206, the terminal displays a second web page that conforms to a preset format; the second web page includes the non-text content and the text content of the first web page.
The terminal may typeset the non-text content and the text content directly, which is the first implementation below. Alternatively, the terminal may identify the body content from the non-text content and the text content and typeset only the body content, which is the second implementation below.
For the first implementation, this step may be: the terminal composes the non-text content and the text content into the web page content of the second web page, and displays the web page content of the second web page in a first preset format.
The first preset format includes a first non-text content display format and a first text content display format. The first text content display format includes a first paragraph format and/or a first font format. The first non-text content display format includes a first quote display format, a first table display format and/or a first code block display format.
In this embodiment of the present invention, after identifying the non-text content and the text content, the terminal typesets them directly, which improves typesetting efficiency. Moreover, because the non-text content is not filtered out, the web page content of the second web page is more readable. In addition, when typesetting the web page content of the second web page, the terminal does not edit the original text or change its meaning, which fully respects the original author.
For example, diagram A in Fig. 2C shows the web page content of the first web page before typesetting: the font is the Song typeface, the title and the recommendation information are in size 14, and the body content is in size 12. The terminal adjusts the font format and font size of the web page content of the first web page, changing the font to regular script and setting the font size of the recommendation information to size 8, to obtain the web page content of the second web page; see diagram B in Fig. 2C.
For the second implementation, this step may be realized through the following steps (1) to (3):
(1) The terminal composes the non-text content and the text content into the web page content of the second web page.
(2) The terminal identifies the body content from the web page content of the second web page.
The terminal may identify the body content in the web page content of the second web page in any of four ways. In the first way, the terminal uses a first preset regular expression, which is used to identify body content in web page content. In the second way, the terminal uses a second preset regular expression, which is used to identify non-body content in web page content. In the third way, the terminal identifies the body content according to the region in which body content is located. In the fourth way, the terminal identifies the body content according to the labels of the element nodes.
First way: step (2) may be: the terminal identifies the body content from the web page content of the second web page by means of the first preset regular expression.
The terminal identifies a second specified element node from the element nodes of the second web page by means of the first preset regular expression, determines the second node content corresponding to the second specified element node from the web page content of the second web page, and uses the second node content as the body content.
The first preset regular expression includes at least one first label, where a first label is a label corresponding to body content. Accordingly, the step in which the terminal identifies the second specified element node from the element nodes of the second web page by means of the first preset regular expression may be: the terminal traverses the label of each element node of the second web page using the first preset regular expression and determines the second specified element node, whose label matches the first preset regular expression.
For example, if the first labels are label A, label B and label C, the first preset regular expression may be: label A or label B or label C.
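A regular expression of the form "label A or label B or label C" can be built as an alternation and applied to the label of each element node. The sketch below assumes hypothetical body labels (article, p, h1) in place of labels A, B and C, and represents each element node as a (label, content) pair; neither choice comes from the embodiments.

```python
import re

# Hypothetical first labels standing in for "label A, label B, label C".
BODY_TAGS = ["article", "p", "h1"]

# Alternation: matches a label that is exactly one of the body labels.
BODY_TAG_RE = re.compile("|".join(re.escape(tag) for tag in BODY_TAGS))


def body_nodes(nodes):
    """Keep the (label, content) pairs whose label matches the expression."""
    return [(tag, content) for tag, content in nodes
            if BODY_TAG_RE.fullmatch(tag)]
```

Using fullmatch rather than search prevents a label such as pre from matching merely because it begins with p.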
Second way: step (2) may be: the terminal identifies the non-body content from the web page content of the second web page by means of the second preset regular expression, and determines the content of the second web page other than the non-body content as the body content. The second preset regular expression is used to identify non-body content in web page content. Non-body content includes recommended links (such as advertisements) and the like.
The second preset regular expression includes at least one second label, where a second label is a label corresponding to non-body content. The terminal identifies a third specified element node from the element nodes of the second web page by means of the second preset regular expression, determines the third node content corresponding to the third specified element node from the web page content of the second web page, and uses the third node content as the non-body content. Accordingly, the step in which the terminal identifies the third specified element node from the element nodes of the second web page by means of the second preset regular expression may be: the terminal traverses the label of each element node of the second web page using the second preset regular expression and determines the third specified element node, whose label matches the second preset regular expression.
In a possible implementation, the second preset regular expression may also include at least one keyword, each keyword being a keyword corresponding to non-body content. For example, the keywords may be "guess you like", "purchase" and so on. Accordingly, the step in which the terminal identifies the non-body content from the web page content of the second web page by means of the second preset regular expression may be: the terminal divides the web page content of the second web page into multiple content blocks and determines the matching degree between each content block and the second preset regular expression; according to these matching degrees, the terminal selects from the multiple content blocks those whose matching degree exceeds a preset threshold and determines them as the non-body content. The terminal may treat each paragraph in the web page content as one content block.
The preset threshold can be set and changed as needed and is not specifically limited in the embodiments of the present invention. For example, the preset threshold may be 80% or 85%.
Third way: because body content is usually located in the middle region of a web page, the terminal can identify the body content from the web page content of the second web page according to a preset body region. Accordingly, step (2) may be: the terminal determines a specified region in the second web page and uses the web page content in the specified region as the body content.
Because different applications lay out their web pages differently, the step in which the terminal determines the specified region in the second web page may be: the terminal obtains the application identifier of the source application of the first web page, determines the specified region corresponding to that application identifier, and locates the specified region in the second web page.
Fourth way: step (2) may be: the terminal determines the weight of each element node of the second web page, determines a first specified element node according to the weight of each element node, determines the first node content corresponding to the first specified element node from the web page content of the second web page, and uses the first node content as the body content.
In this embodiment of the present invention, the terminal may determine the weight of an element node by combining the label and the node content of the element node. Accordingly, the step in which the terminal determines the weight of each element node of the second web page may be realized through the following steps (2-1) to (2-4):
(2-1) The terminal determines the label type of each element node and the number of words contained in the node content corresponding to each element node.
(2-2) The terminal determines a first weight of each element node according to the label type of each element node.
The terminal stores the correspondence between label types and weights. Accordingly, this step may be: the terminal obtains the first weight of each element node from the correspondence between label types and weights according to the label type of each element node.
(2-3) The terminal determines a second weight of each element node according to the number of words contained in the node content corresponding to each element node.
The terminal stores the correspondence between word counts and weights. Accordingly, this step may be: the terminal obtains the second weight of each element node from the correspondence between word counts and weights according to the number of words contained in the node content corresponding to each element node.
In this embodiment of the present invention, the terminal stores the correspondence between word counts and weights and obtains the second weight of each element node from that correspondence according to the number of words contained in the node content corresponding to each element node, which improves the accuracy of the determined second weight of each element node.
The terminal may instead store the correspondence between word-count ranges and weights. Accordingly, this step may be: according to the number of words contained in the node content corresponding to each element node and the stored word-count ranges, the terminal determines the word-count range into which that number of words falls, and then obtains the second weight of each element node from the correspondence between word-count ranges and weights according to the word-count range of each element node.
In this embodiment of the present invention, the terminal stores the correspondence between word-count ranges and weights and determines the second weight of each element node from the number of words contained in the corresponding node content and that correspondence. The terminal therefore does not need to store a correspondence between every individual word count and a weight, which saves storage space.
(2-4) The terminal determines the weight of each element node according to the first weight and the second weight of each element node.
For each element node, the terminal determines a first coefficient for the first weight and a second coefficient for the second weight, multiplies the first weight of the element node by the first coefficient to obtain a first value, multiplies the second weight of the element node by the second coefficient to obtain a second value, and uses the sum of the first value and the second value as the weight of the element node.
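Step (2-4) amounts to a weighted sum of the two weights. The sketch below computes that sum and then, as an assumption not stated in the description, takes the element node with the largest weight as the first specified element node. The default coefficients of 0.5 and the function names are hypothetical.

```python
def combined_weight(first_weight, second_weight, first_coeff=0.5, second_coeff=0.5):
    """Weight = first_weight * first_coeff + second_weight * second_coeff."""
    return first_weight * first_coeff + second_weight * second_coeff


def pick_specified_node(weighted_nodes):
    """weighted_nodes: list of (node_name, first_weight, second_weight).

    Assumption: the node with the highest combined weight is selected.
    """
    return max(weighted_nodes, key=lambda n: combined_weight(n[1], n[2]))[0]
```

For instance, a node with a high label-type weight and a long node content would outrank a navigation node that has a low label-type weight and few words.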
(3) The terminal displays the body content of the second web page in a second preset format.
The second preset format may be the same as or different from the first preset format. The second preset format includes a second non-text content display format and a second text content display format. The second text content display format includes a second paragraph format and/or a second font format. The second non-text content display format includes a second quote display format, a second table display format and/or a second code block display format.
For example, referring to diagram C in Fig. 2C, the terminal filters out the non-body content in diagram B of Fig. 2C and displays only the body content of the second web page.
In this embodiment of the present invention, the terminal identifies the body content from the web page content of the second web page and thereby filters out non-body content such as advertisements and/or recommendations. This prevents non-body content from disturbing the user and improves user stickiness. Moreover, because the typesetting of the web page content is performed by the terminal, the concurrency load on the server is reduced.
After displaying the second web page, the terminal stores the web page address of the first web page together with the typeset web page content of the second web page. When the terminal later displays the second web page again, it obtains the web page content of the second web page directly from the correspondence between the web page address of the first web page and the web page content of the second web page, according to the web page address of the first web page, and displays it without repeating the typesetting process above, which improves typesetting efficiency.
In the method for displaying web page content provided by the embodiments of the present invention, the terminal obtains the web page elements of the first web page to be displayed and, according to the labels of the web page elements, determines the non-text content and the text content of the first web page from the web page content of the first web page. Because both the text content and the non-text content can be determined from the web page content of the first web page, the second web page can display not only the text content of the first web page but also its non-text content. This avoids the large difference between the content before and after typesetting that results from filtering out non-text content, and improves accuracy.
An embodiment of the present invention provides a method for displaying favorited web page content. The method is executed by a terminal. Referring to Fig. 3, the method includes:
301, the terminal displays the favorites entry of at least one favorited web page; the favorites entry of any web page includes the web page address of that web page.
The main interface of the web page content currently displayed by the terminal includes a view button. When the user wants to view the web page content of a web page, the user can trigger the view button to send a view instruction to the terminal. When the terminal detects that the view button has been triggered, it responds to the view instruction by displaying the favorites entry of at least one web page. The favorites entry of each web page includes at least the web page address of the web page, and may also include information such as the summary of the web page, the title of the web page and/or the application identifier of the source application of the web page. Based on the displayed favorites entries, the user can select the web page address of the web page to be displayed from the web page addresses of the at least one web page and submit the selected web page address to the terminal.
302, the terminal obtains the web page elements of the first web page according to the web page address of the selected first web page.
This step is the same as the process of obtaining the web page elements of the first web page in step 201 and is not described again here.
303, the terminal determines the non-text content and the text content of the first web page from the web page content of the first web page according to the labels of the web page elements.
This step can be realized through steps 202 to 205 above and is not described again here.
304, the terminal displays a second web page that conforms to a preset format; the second web page includes the non-text content and the text content of the first web page.
This step is the same as step 206 and is not described again here.
In the method for displaying favorited web page content provided by the embodiments of the present invention, the terminal displays the favorites entry of at least one favorited web page, and the favorites entry of any web page includes the web page address of that web page. When the user wants to read the web page content of a web page, the user can click the web page address of that web page. The terminal obtains the web page elements of the first web page according to the web page address of the selected first web page and, according to the labels of the web page elements, determines the non-text content and the text content of the first web page from the web page content of the first web page. Because both the text content and the non-text content can be determined from the web page content of the first web page, the second web page can display not only the text content of the first web page but also its non-text content. This avoids the large difference between the content before and after typesetting that results from filtering out non-text content, and improves accuracy.
An embodiment of the present invention provides an apparatus for displaying web page content. The apparatus is applied in a terminal and is configured to perform the steps executed by the terminal in the above method for displaying web page content. Referring to Fig. 4A, the apparatus includes:
Obtaining module 401, configured to obtain the web page elements of the first web page to be displayed;
Determining module 402, configured to determine the non-text content and the text content of the first web page from the web page content of the first web page according to the labels of the web page elements;
Display module 403, configured to typeset the non-text content and the text content and display the second web page.
In a possible implementation, referring to Fig. 4B, the determining module 402 includes:
Construction unit 4021, configured to construct the topological structure of the first web page according to the web page elements, where each element node of the topological structure corresponds to one web page element;
Determination unit 4022, configured to determine the first element node in the topological structure, the first element node being an element node of non-text content;
Determination unit 4022, further configured to determine the second element node in the topological structure, the second element node being an element node of text content;
Obtaining unit 4023, configured to obtain the non-text content from the node label of the first element node and obtain the text content from the node label of the second element node.
In a possible implementation, the determination unit 4022 is further configured to determine the element node corresponding to a first-type label in the topological structure and/or determine the element node corresponding to a second-type label in the topological structure, where the first-type label includes a quote label, a table label and/or a code block label, and the second-type label includes a custom label; and to use the element node corresponding to the first-type label and/or the element node corresponding to the second-type label as the first element node.
In a possible implementation, the determination unit 4022 is further configured to obtain the application identifier of the source application of the first web page, and to determine, according to the application identifier, the first-type label and/or the second-type label corresponding to the application identifier.
In a possible implementation, referring to Fig. 4C, the apparatus further includes:
Deletion module 404, configured to determine the non-content web page elements among the web page elements and delete the non-content web page elements from the web page elements.
In a possible implementation, referring to Fig. 4D, the display module 403 includes:
Composing unit 4031, configured to compose the non-text content and the text content into the web page content of the second web page;
Identification unit 4032, configured to identify the body content from the web page content of the second web page;
Display unit 4033, configured to display the second web page that conforms to the preset format, the second web page including the non-text content and the text content of the first web page.
In a possible implementation, the identification unit 4032 is further configured to identify the body content from the web page content of the second web page by means of a preset regular expression, the preset regular expression being used to identify body content in web page content; and/or
the identification unit 4032 is further configured to determine the weight of each element node of the second web page, determine a first specified element node according to the weight of each element node, determine the first node content corresponding to the first specified element node from the web page content of the second web page, and use the first node content as the body content.
In a possible implementation, the identification unit 4032 is further configured to determine the label type of each element node and the number of words contained in the node content corresponding to each element node; determine the first weight of each element node according to the label type of each element node; determine the second weight of each element node according to the number of words contained in the node content corresponding to each element node; and determine the weight of each element node according to the first weight and the second weight of each element node.
In a possible implementation, the identification unit 4032 is further configured to identify a second specified element node from the element nodes of the second web page by means of the preset regular expression, determine the second node content corresponding to the second specified element node from the web page content of the second web page, and use the second node content as the body content.
In a possible implementation, the obtaining module 401 is further configured to, in response to a view instruction, display the web page addresses of at least one favorited web page according to the view instruction; obtain the web page address of the selected first web page from the web page addresses of the at least one web page; and obtain the web page elements of the first web page according to the web page address of the first web page.
In the method for displaying web page content provided by the embodiments of the present invention, the web page elements of the first web page to be displayed are obtained, and the non-text content and the text content of the first web page are determined from the web page content of the first web page according to the labels of the web page elements. Because both the text content and the non-text content can be determined from the web page content of the first web page, the second web page can display not only the text content of the first web page but also its non-text content. This avoids the large difference between the content before and after typesetting that results from filtering out non-text content, and improves accuracy.
An embodiment of the present invention provides an apparatus for displaying favorited web page content. Referring to Fig. 5, the apparatus includes:
Display module 501, configured to display the favorites entry of at least one favorited web page, the favorites entry of any web page including the web page address of that web page;
Obtaining module 502, configured to obtain the web page elements of the first web page according to the web page address of the selected first web page;
Determining module 503, configured to determine the non-text content and the text content of the first web page from the web page content of the first web page according to the labels of the web page elements;
Display module 501, further configured to display the second web page that conforms to the preset format, the second web page including the non-text content and the text content of the first web page.
In the method for displaying favorited web page content provided by the embodiments of the present invention, the terminal displays the favorites entry of at least one favorited web page, and the favorites entry of any web page includes the web page address of that web page. When the user wants to read the web page content of a web page, the user can click the web page address of that web page. The terminal obtains the web page elements of the first web page according to the web page address of the selected first web page and, according to the labels of the web page elements, determines the non-text content and the text content of the first web page from the web page content of the first web page. Because both the text content and the non-text content can be determined from the web page content of the first web page, the second web page can display not only the text content of the first web page but also its non-text content. This avoids the large difference between the content before and after typesetting that results from filtering out non-text content, and improves accuracy.
It should be understood that, when the apparatus for displaying web page content provided by the above embodiments displays web page content, the division into the functional modules described above is only used as an example. In practical applications, the functions may be assigned to different functional modules as needed; that is, the internal structure of the apparatus may be divided into different functional modules to complete all or part of the functions described above. In addition, the apparatus for displaying web page content provided by the above embodiments and the method embodiments for displaying web page content belong to the same concept; for the specific implementation process, see the method embodiments, which are not described again here.
Fig. 6 shows a structural block diagram of a terminal 600 provided by an exemplary embodiment of the present invention. The terminal 600 may be a smartphone, a tablet computer, an MP3 player (Moving Picture Experts Group Audio Layer III), an MP4 player (Moving Picture Experts Group Audio Layer IV), a laptop computer or a desktop computer. The terminal 600 may also be referred to by other names such as user equipment, portable terminal, laptop terminal or desktop terminal.
In general, the terminal 600 includes a processor 601 and a memory 602.
The processor 601 may include one or more processing cores, for example a 4-core processor or an 8-core processor. The processor 601 may be implemented in at least one of the following hardware forms: DSP (Digital Signal Processing), FPGA (Field-Programmable Gate Array) and PLA (Programmable Logic Array). The processor 601 may also include a main processor and a coprocessor. The main processor is a processor for processing data in the awake state, also called a CPU (Central Processing Unit); the coprocessor is a low-power processor for processing data in the standby state. In some embodiments, the processor 601 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content to be displayed on the display screen. In some embodiments, the processor 601 may also include an AI (Artificial Intelligence) processor, which handles computing operations related to machine learning.
The memory 602 may include one or more computer-readable storage media, which may be non-transitory. The memory 602 may also include high-speed random access memory and non-volatile memory, such as one or more magnetic disk storage devices or flash memory devices. In some embodiments, the non-transitory computer-readable storage medium in the memory 602 is configured to store at least one instruction, and the at least one instruction is executed by the processor 601 to implement the method for displaying web page content provided in the method embodiments of this application.
In some embodiments, the terminal 600 optionally further includes a peripheral device interface 603 and at least one peripheral device. The processor 601, the memory 602, and the peripheral device interface 603 may be connected by a bus or signal lines. Each peripheral device may be connected to the peripheral device interface 603 by a bus, a signal line, or a circuit board. Specifically, the peripheral devices include at least one of a radio frequency circuit 604, a touch display screen 605, a camera assembly 606, an audio circuit 607, a positioning component 608, and a power supply 609.
The peripheral device interface 603 may be used to connect at least one I/O (Input/Output) related peripheral device to the processor 601 and the memory 602. In some embodiments, the processor 601, the memory 602, and the peripheral device interface 603 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 601, the memory 602, and the peripheral device interface 603 may be implemented on a separate chip or circuit board, which is not limited in this embodiment.
The radio frequency circuit 604 is used to receive and transmit RF (Radio Frequency) signals, also called electromagnetic signals. The radio frequency circuit 604 communicates with communication networks and other communication devices through electromagnetic signals. The radio frequency circuit 604 converts electrical signals into electromagnetic signals for transmission, or converts received electromagnetic signals into electrical signals. Optionally, the radio frequency circuit 604 includes an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and the like. The radio frequency circuit 604 may communicate with other terminals through at least one wireless communication protocol. The wireless communication protocols include, but are not limited to: the World Wide Web, metropolitan area networks, intranets, mobile communication networks of each generation (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 604 may further include NFC (Near Field Communication) related circuits, which is not limited in this application.
The display screen 605 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. When the display screen 605 is a touch display screen, the display screen 605 is also capable of collecting touch signals on or above its surface. The touch signal may be input to the processor 601 as a control signal for processing. In this case, the display screen 605 may also be used to provide virtual buttons and/or a virtual keyboard, also called soft buttons and/or a soft keyboard. In some embodiments, there may be one display screen 605, arranged on the front panel of the terminal 600; in other embodiments, there may be at least two display screens 605, arranged on different surfaces of the terminal 600 or in a folding design; in still other embodiments, the display screen 605 may be a flexible display screen, arranged on a curved surface or a folding surface of the terminal 600. The display screen 605 may even be set to a non-rectangular irregular shape, that is, a special-shaped screen. The display screen 605 may be made of materials such as LCD (Liquid Crystal Display) or OLED (Organic Light-Emitting Diode).
The camera assembly 606 is used to capture images or video. Optionally, the camera assembly 606 includes a front camera and a rear camera. In general, the front camera is arranged on the front panel of the terminal, and the rear camera is arranged on the back of the terminal. In some embodiments, there are at least two rear cameras, each being any one of a main camera, a depth-of-field camera, a wide-angle camera, and a telephoto camera, so that the main camera and the depth-of-field camera can be fused to realize a background blurring function, and the main camera and the wide-angle camera can be fused to realize panoramic shooting, VR (Virtual Reality) shooting, or other fused shooting functions. In some embodiments, the camera assembly 606 may further include a flash. The flash may be a single-color-temperature flash or a dual-color-temperature flash. A dual-color-temperature flash is a combination of a warm-light flash and a cold-light flash, and can be used for light compensation at different color temperatures.
The audio circuit 607 may include a microphone and a speaker. The microphone is used to collect sound waves from the user and the environment, convert the sound waves into electrical signals, and input them to the processor 601 for processing, or to the radio frequency circuit 604 to realize voice communication. For stereo capture or noise reduction, there may be multiple microphones, arranged at different parts of the terminal 600. The microphone may also be an array microphone or an omnidirectional microphone. The speaker is used to convert electrical signals from the processor 601 or the radio frequency circuit 604 into sound waves. The speaker may be a conventional thin-film speaker or a piezoelectric ceramic speaker. When the speaker is a piezoelectric ceramic speaker, it can convert electrical signals not only into sound waves audible to humans, but also into sound waves inaudible to humans for purposes such as ranging. In some embodiments, the audio circuit 607 may further include a headphone jack.
The positioning component 608 is used to determine the current geographic location of the terminal 600 to implement navigation or LBS (Location Based Service). The positioning component 608 may be a positioning component based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, the GLONASS system of Russia, or the Galileo system of the European Union.
The power supply 609 is used to supply power to the components in the terminal 600. The power supply 609 may be alternating current, direct current, a disposable battery, or a rechargeable battery. When the power supply 609 includes a rechargeable battery, the rechargeable battery may be a wired rechargeable battery or a wireless rechargeable battery. A wired rechargeable battery is charged through a wired line, and a wireless rechargeable battery is charged through a wireless coil. The rechargeable battery may also support fast charging technology.
In some embodiments, the terminal 600 further includes one or more sensors 610. The one or more sensors 610 include, but are not limited to, an acceleration sensor 611, a gyroscope sensor 612, a pressure sensor 613, a fingerprint sensor 614, an optical sensor 615, and a proximity sensor 616.
The acceleration sensor 611 can detect the magnitude of acceleration on the three coordinate axes of a coordinate system established with respect to the terminal 600. For example, the acceleration sensor 611 may be used to detect the components of gravitational acceleration on the three coordinate axes. The processor 601 may control the touch display screen 605 to display the user interface in a landscape view or a portrait view according to the gravitational acceleration signal collected by the acceleration sensor 611. The acceleration sensor 611 may also be used to collect motion data of a game or of the user.
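As a non-limiting illustration of choosing between the landscape view and the portrait view from the gravity components, a minimal Python sketch follows. The function name `view_orientation`, the axis convention (x along the short edge of the screen, y along the long edge), and the comparison rule are assumptions made for this example and are not specified by the embodiment.

```python
def view_orientation(gx: float, gy: float) -> str:
    """Return 'portrait' or 'landscape' from gravity components in m/s^2.

    Assumption: x runs along the short edge of the screen and y along the
    long edge, so gravity lies mostly along y when the terminal is upright.
    """
    # Compare magnitudes only; the sign tells which way up the terminal is,
    # which this sketch deliberately ignores.
    return "portrait" if abs(gy) >= abs(gx) else "landscape"
```

A real terminal would normally also filter the signal and apply hysteresis so that the view does not flip when the two components are nearly equal.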
The gyroscope sensor 612 can detect the body orientation and rotation angle of the terminal 600, and can cooperate with the acceleration sensor 611 to capture the user's 3D actions on the terminal 600. Based on the data collected by the gyroscope sensor 612, the processor 601 can implement the following functions: motion sensing (for example, changing the UI according to a tilt operation by the user), image stabilization during shooting, game control, and inertial navigation.
The pressure sensor 613 may be arranged on a side frame of the terminal 600 and/or on a lower layer of the touch display screen 605. When the pressure sensor 613 is arranged on the side frame of the terminal 600, it can detect the user's grip signal on the terminal 600, and the processor 601 performs left/right hand recognition or shortcut operations according to the grip signal collected by the pressure sensor 613. When the pressure sensor 613 is arranged on the lower layer of the touch display screen 605, the processor 601 controls the operable controls on the UI according to the user's pressure operation on the touch display screen 605. The operable controls include at least one of a button control, a scroll bar control, an icon control, and a menu control.
The fingerprint sensor 614 is used to collect the user's fingerprint. The processor 601 identifies the user according to the fingerprint collected by the fingerprint sensor 614, or the fingerprint sensor 614 itself identifies the user according to the collected fingerprint. When the user's identity is recognized as a trusted identity, the processor 601 authorizes the user to perform related sensitive operations, including unlocking the screen, viewing encrypted information, downloading software, making payments, and changing settings. The fingerprint sensor 614 may be arranged on the front, back, or side of the terminal 600. When a physical button or a manufacturer logo is provided on the terminal 600, the fingerprint sensor 614 may be integrated with the physical button or the manufacturer logo.
The optical sensor 615 is used to collect the ambient light intensity. In one embodiment, the processor 601 may control the display brightness of the touch display screen 605 according to the ambient light intensity collected by the optical sensor 615. Specifically, when the ambient light intensity is high, the display brightness of the touch display screen 605 is increased; when the ambient light intensity is low, the display brightness of the touch display screen 605 is decreased. In another embodiment, the processor 601 may also dynamically adjust the shooting parameters of the camera assembly 606 according to the ambient light intensity collected by the optical sensor 615.
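The brightness control described above can be sketched, purely as an example, as a monotonic mapping from ambient light intensity to a brightness level. The function name `display_brightness`, the linear mapping, and the default limits (levels 10 to 255, saturation at 1000 lux) are assumptions chosen for illustration; the embodiment states only that brightness rises and falls with the ambient light intensity.

```python
def display_brightness(ambient_lux: float,
                       min_level: int = 10,
                       max_level: int = 255,
                       max_lux: float = 1000.0) -> int:
    """Map ambient light intensity to a display brightness level.

    Assumed behaviour: a linear ramp between min_level and max_level,
    clamped so readings below 0 or above max_lux stay within range.
    """
    clamped = min(max(ambient_lux, 0.0), max_lux)
    ratio = clamped / max_lux
    return round(min_level + ratio * (max_level - min_level))
```

Brighter surroundings yield a higher level and darker surroundings a lower one, which is the only property the description requires.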
The proximity sensor 616, also called a distance sensor, is generally arranged on the front panel of the terminal 600. The proximity sensor 616 is used to collect the distance between the user and the front of the terminal 600. In one embodiment, when the proximity sensor 616 detects that the distance between the user and the front of the terminal 600 gradually decreases, the processor 601 controls the touch display screen 605 to switch from the screen-on state to the screen-off state; when the proximity sensor 616 detects that the distance between the user and the front of the terminal 600 gradually increases, the processor 601 controls the touch display screen 605 to switch from the screen-off state to the screen-on state.
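The screen switching driven by the proximity sensor can be illustrated with the following Python sketch. The names `next_screen_state` and `replay`, the string states "on" and "off", and the rule of comparing two consecutive readings are assumptions for this example; the embodiment does not specify how a gradual change in distance is detected.

```python
def next_screen_state(state: str, previous_cm: float, current_cm: float) -> str:
    """Return the screen state after one new distance reading."""
    if current_cm < previous_cm:
        return "off"   # user is approaching the front of the terminal
    if current_cm > previous_cm:
        return "on"    # user is moving away from the terminal
    return state       # distance unchanged: keep the current state


def replay(distances_cm, state="on"):
    """Apply next_screen_state to a sequence of readings; return each state."""
    states = []
    for previous, current in zip(distances_cm, distances_cm[1:]):
        state = next_screen_state(state, previous, current)
        states.append(state)
    return states
```

For example, `replay([10, 6, 2, 2, 5, 9])` keeps the screen off while the readings fall or hold steady and turns it back on once they rise.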
Those skilled in the art will understand that the structure shown in Fig. 6 does not limit the terminal 600, which may include more or fewer components than shown, combine certain components, or use a different arrangement of components.
An embodiment of the present invention further provides a computer-readable storage medium applied to a terminal. The computer-readable storage medium stores at least one instruction, at least one program, a code set, or an instruction set, and the instruction, the program, the code set, or the instruction set is loaded and executed by a processor to implement the operations performed by the terminal in the method for displaying web page content of the above embodiments.
An embodiment of the present invention further provides a computer-readable storage medium applied to a terminal. The computer-readable storage medium stores at least one instruction, at least one program, a code set, or an instruction set, and the instruction, the program, the code set, or the instruction set is loaded and executed by a processor to implement the operations performed by the terminal in the method for displaying bookmarked web page content of the above embodiments.
Those of ordinary skill in the art will understand that all or part of the steps of the above embodiments may be completed by hardware, or by a program instructing related hardware. The program may be stored in a computer-readable storage medium, and the storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like.
The foregoing are merely preferred embodiments of the present invention and are not intended to limit the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.
Claims (15)
1. A method for displaying web page content, characterized in that the method comprises:
obtaining web page elements of a first web page to be displayed;
determining, according to tags of the web page elements, non-text content and text content of the first web page from the web page content of the first web page; and
displaying a second web page that conforms to a preset format, the second web page including the non-text content and the text content of the first web page.
2. The method according to claim 1, characterized in that the determining, according to the tags of the web page elements, the non-text content and the text content of the first web page from the web page content of the first web page comprises:
constructing a topological structure of the first web page according to the web page elements, each element node of the topological structure corresponding to one web page element;
determining a first element node in the topological structure, the first element node being an element node of non-text content;
determining a second element node in the topological structure, the second element node being an element node of text content; and
obtaining the non-text content from the node tag of the first element node, and obtaining the text content from the node tag of the second element node.
3. The method according to claim 2, characterized in that the determining the first element node in the topological structure comprises:
determining, in the topological structure, element nodes corresponding to first-class tags, and/or determining, in the topological structure, element nodes corresponding to second-class tags, the first-class tags including quote tags, form tags, and/or code block tags, and the second-class tags including custom tags; and
using the element nodes corresponding to the first-class tags and/or the element nodes corresponding to the second-class tags as the first element node.
4. The method according to claim 3, characterized in that the method further comprises:
obtaining an application identifier of the source application of the first web page; and
determining, according to the application identifier, the first-class tags and/or the second-class tags corresponding to the application identifier.
5. The method according to claim 1, characterized in that, before the determining, according to the tags of the web page elements, the non-text content and the text content of the first web page from the web page content of the first web page, the method further comprises:
determining non-content web page elements among the web page elements, and deleting the non-content web page elements from the web page elements.
6. The method according to claim 1, characterized in that the displaying the second web page that conforms to the preset format, the second web page including the non-text content and the text content of the first web page, comprises:
composing the non-text content and the text content into the web page content of the second web page;
identifying body content from the web page content of the second web page; and
displaying the body content of the second web page that conforms to the preset format.
7. The method according to claim 6, characterized in that the identifying the body content from the web page content of the second web page comprises:
identifying the body content from the web page content of the second web page by means of a preset regular expression, the preset regular expression being used to identify body content in web page content; and/or
determining a weight of each element node of the second web page, determining a first specified element node according to the weight of each element node, determining first node content corresponding to the first specified element node from the web page content of the second web page, and using the first node content as the body content.
8. The method according to claim 7, characterized in that the determining the weight of each element node of the second web page comprises:
determining the tag type of each element node and the number of words included in the node content corresponding to each element node;
determining a first weight of each element node according to the tag type of the element node;
determining a second weight of each element node according to the number of words included in the node content corresponding to the element node; and
determining the weight of each element node according to the first weight and the second weight of the element node.
9. The method according to claim 7, characterized in that the identifying the body content from the web page content of the second web page by means of the preset regular expression comprises:
identifying a second specified element node from the element nodes of the second web page by means of the preset regular expression; and
determining second node content corresponding to the second specified element node from the web page content of the second web page, and using the second node content as the body content.
10. The method according to any one of claims 1 to 9, characterized in that the obtaining the web page elements of the first web page to be displayed comprises:
in response to a view instruction, displaying, according to the view instruction, the web page address of at least one bookmarked web page;
obtaining the web page address of the selected first web page from the web page address of the at least one web page; and
obtaining the web page elements of the first web page according to the web page address of the first web page.
11. A method for displaying bookmarked web page content, characterized in that the method comprises:
displaying a bookmark entry of at least one bookmarked web page, the bookmark entry of any web page including the web page address of that web page;
obtaining web page elements of a selected first web page according to the web page address of the first web page;
determining, according to tags of the web page elements, non-text content and text content of the first web page from the web page content of the first web page; and
displaying a second web page that conforms to a preset format, the second web page including the non-text content and the text content of the first web page.
12. An apparatus for displaying web page content, characterized in that the apparatus comprises:
an obtaining module, configured to obtain web page elements of a first web page to be displayed;
a determining module, configured to determine, according to tags of the web page elements, non-text content and text content of the first web page from the web page content of the first web page; and
a display module, configured to display a second web page that conforms to a preset format, the second web page including the non-text content and the text content of the first web page.
13. An apparatus for displaying bookmarked web page content, characterized in that the apparatus comprises:
a display module, configured to display a bookmark entry of at least one bookmarked web page, the bookmark entry of any web page including the web page address of that web page;
an obtaining module, configured to obtain web page elements of a selected first web page according to the web page address of the first web page; and
a determining module, configured to determine, according to tags of the web page elements, non-text content and text content of the first web page from the web page content of the first web page;
the display module being further configured to display a second web page that conforms to a preset format, the second web page including the non-text content and the text content of the first web page.
14. A terminal, characterized in that the terminal comprises a processor and a memory, the memory storing at least one instruction, at least one program, a code set, or an instruction set, and the instruction, the program, the code set, or the instruction set being loaded and executed by the processor to implement the operations performed in the method for displaying web page content according to any one of claims 1 to 10, or to implement the operations performed in the method for displaying bookmarked web page content according to claim 11.
15. A computer-readable storage medium, characterized in that the computer-readable storage medium stores at least one instruction, at least one program, a code set, or an instruction set, and the instruction, the program, the code set, or the instruction set is loaded and executed by a processor to implement the operations performed in the method for displaying web page content according to any one of claims 1 to 10, or to implement the operations performed in the method for displaying bookmarked web page content according to claim 11.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711202503.7A CN109948095B (en) | 2017-11-27 | 2017-11-27 | Method, device, terminal and storage medium for displaying webpage content |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109948095A (en) | 2019-06-28 |
CN109948095B (en) | 2022-09-30 |
Family
ID=67003973
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711202503.7A Active CN109948095B (en) | 2017-11-27 | 2017-11-27 | Method, device, terminal and storage medium for displaying webpage content |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109948095B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114020987A (en) * | 2022-01-06 | 2022-02-08 | 北京微步在线科技有限公司 | Sample data acquisition method, device, equipment and storage medium based on webpage |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101477564A (en) * | 2009-01-21 | 2009-07-08 | 北京千家悦网络科技有限公司 | Intelligent layout method for displaying wide web page on narrow-screen equipment |
CN103150389A (en) * | 2013-03-21 | 2013-06-12 | 北京奇虎科技有限公司 | Method and device for processing matching setting of webpage text contents |
CN103345532A (en) * | 2013-07-26 | 2013-10-09 | 人民搜索网络股份公司 | Method and device for extracting webpage information |
CN106095985A (en) * | 2016-06-20 | 2016-11-09 | 网际傲游(北京)科技有限公司 | A kind of dynamic collection the method for cluster web pages information |
US20170052994A1 (en) * | 2015-08-18 | 2017-02-23 | Samsung Electronics Co., Ltd. | Method and system for bookmarking a webpage |
CN107329985A (en) * | 2017-05-31 | 2017-11-07 | 北京安云世纪科技有限公司 | A kind of collecting method of the page, device and mobile terminal |
Non-Patent Citations (2)
Title |
---|
KAMPS J: "Language Models for Searching in Web Corpora", 《THIRTEENTH TEXT RETRIEVAL CONFERENCE》 * |
孙莉娜: "基于超链接信息的Web文本聚类方法研究", 《电脑知识与技术(学术交流)》 * |
Also Published As
Publication number | Publication date |
---|---|
CN109948095B (en) | 2022-09-30 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||