CN109255088A - Web data monitoring method and equipment - Google Patents
- Publication number
- CN109255088A (application CN201710552265.6A)
- Authority
- CN
- China
- Prior art keywords
- source code
- node
- DOM tree
- webpage source
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present invention provides a web data monitoring method and equipment that address the need for a monitoring method with low memory consumption, high monitoring efficiency, and a simple monitoring process. The method includes: receiving first web page source code; parsing the first web page source code to obtain its DOM tree; obtaining from a graph database, according to the order in which tag node x is visited while the DOM tree of the first web page source code is traversed, the corresponding node data y of the DOM tree of second web page source code; comparing the data of tag node x with the corresponding node data y to obtain a comparison result; and continuing until all tag nodes of the DOM tree of the first web page source code have been traversed and compared. The method of the invention reduces memory consumption.
Description
Technical field
The present invention relates to computer technology, and in particular to a web data monitoring method and equipment.
Background art
DOM (Document Object Model) is a W3C standard that defines how HTML and XML documents are accessed. The HTML DOM defines the objects and properties of all HTML elements, together with the methods for accessing them. Consider, for example, a fragment of web page source code in which html is the root node of the page, lang="en" is an attribute of the root node, and head and body are the two child nodes of html. Continuing in this way, the logical relationships, node attributes, and content of the whole page can be presented in the form of a tree, as shown in Fig. 1.
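As an illustration of the tree form described above, the following minimal sketch parses a small page with Python's standard html.parser module. The fragment in SOURCE is hypothetical and was chosen only to match the description (an html root with lang="en" and the children head and body). The names Node, DomBuilder, parse_dom and outline are likewise assumptions made for this example.

```python
from html.parser import HTMLParser

class Node:
    """One tag node of the DOM tree: name, attributes, text content, children."""
    def __init__(self, name, attrs):
        self.name = name
        self.attrs = dict(attrs)
        self.text = ""
        self.children = []

class DomBuilder(HTMLParser):
    """Builds a tree of Node objects, tracking open tags on a stack.
    Simplification: void elements such as <br> are not handled."""
    def __init__(self):
        super().__init__()
        self.root = None
        self.stack = []

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs)
        if self.stack:
            self.stack[-1].children.append(node)
        else:
            self.root = node
        self.stack.append(node)

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1].name == tag:
            self.stack.pop()

    def handle_data(self, data):
        if self.stack and data.strip():
            self.stack[-1].text += data.strip()

def parse_dom(source):
    builder = DomBuilder()
    builder.feed(source)
    builder.close()
    return builder.root

def outline(node, depth=0):
    """Return the tree as indented lines, one line per tag node."""
    lines = ["  " * depth + node.name]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines

# Hypothetical fragment, not taken from the original document.
SOURCE = '<html lang="en"><head><title>Demo</title></head><body><p>Hello</p></body></html>'
root = parse_dom(SOURCE)
print("\n".join(outline(root)))
print(root.attrs)
```

The indentation of the printed outline mirrors the parent and child levels that Fig. 1 depicts.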
To determine which specific node's content in the web page source code has changed, or to judge whether the page structure has changed and which part of it has changed, the web page source code must be parsed according to the syntax rules of HTML before content and structure can be compared. Parsing web page source code consumes a great deal of memory, so this comparison method is not commonly used.
In the prior art, another approach monitors web pages based on data obtained by analyzing the page: the acquired data is stored in two-dimensional relational tables, and whether the page content has changed is judged by comparing the records that correspond to the two web pages in those tables. However, the page structure of a given website changes (that is, the page layout topology is updated; for example, before an update the detail page of a TV series lists other TV series in which the leading actors appear, and after the update it no longer does). As a result, the pre-designed two-dimensional table database may not be suitable for storing pages whose structure has changed, and new tables must be designed for the new page to store the data acquired from it so that changes in the page content can be monitored.
The prior art commonly handles the records of a new web page by adding tables or updating table fields. For example, if a display area for actors is newly added to the page, an actor field is added to the corresponding table and a cast table is newly created to record the data content of the newly added area.
As can be seen from the above, because the structure of a web page may be extremely complex, the associations between tables also become very complex, and these associations are often undocumented, which makes it difficult to construct efficient queries over the collected data. Moreover, when the page structure changes and the structure of these tables must be updated or relation tables added, errors are easily introduced. Once the relationships or structures of the tables are wrong, it is difficult to guarantee that the meaning of the data recorded in the tables is consistent with the meaning of the data in the page, which ultimately causes web page monitoring to fail.
Accordingly, there is a need for a monitoring method whose monitoring process is simple and not error-prone.
Summary of the invention
In view of the above problems, the present invention proposes a web data monitoring method and equipment that overcome the above problems or at least partially solve them.
In a first aspect, the present invention provides a web data monitoring method, comprising: comparing the node data of the DOM tree corresponding to first web page source code with the corresponding node data of the DOM tree corresponding to second web page source code to obtain a web page monitoring result;
wherein the data of the DOM tree corresponding to the second web page source code is stored in a graph database.
Optionally, comparing the node data of the DOM tree corresponding to the first web page source code with the corresponding node data of the DOM tree corresponding to the second web page source code to obtain a web page monitoring result comprises:
parsing the first web page source code to obtain the DOM tree of the first web page source code;
obtaining from the graph database, according to the order in which tag node x is visited while the DOM tree of the first web page source code is traversed, the corresponding node data y of the DOM tree of the second web page source code;
comparing the data of tag node x with the corresponding node data y to obtain a comparison result;
and continuing until all tag nodes of the DOM tree of the first web page source code have been traversed.
Optionally, comparing the node data of the DOM tree corresponding to the first web page source code with the corresponding node data of the DOM tree corresponding to the second web page source code to obtain a web page monitoring result comprises:
parsing the first web page source code to obtain the DOM tree of the first web page source code;
obtaining node data y of the DOM tree of the second web page source code from the graph database;
obtaining, according to the relationships between node data in the graph database, the data of the corresponding tag node x of the DOM tree of the first web page source code;
comparing the data of tag node x with the corresponding node data y to obtain a comparison result;
and continuing until all node data of the DOM tree of the second web page source code has been traversed.
Optionally, the manner in which the second web page source code is stored in the graph database corresponds to the manner in which the first web page source code is traversed.
Optionally, the graph database is Neo4j.
Optionally, before the first web page source code is received, the method further comprises:
traversing the DOM tree of the second web page source code to obtain the node data from the root node of the DOM tree to the leaf nodes;
storing the node data in order, according to the order in which nodes are visited while the DOM tree of the second web page source code is traversed.
Optionally, the node data includes the node name, the node attributes, the node content, and the parent-child relationships of the node.
In a second aspect, the present invention provides computer equipment, comprising a memory, a processor, and a computer program that is stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the steps of any of the methods described above.
In a third aspect, the present invention provides a computer-readable storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the steps of any of the methods described above.
In a fourth aspect, the present invention provides equipment comprising a comparison module, a result module, and a storage module;
the comparison module compares the node data of each node of the DOM tree corresponding to the first web page source code with that of the DOM tree corresponding to the second web page source code;
the result module is used to obtain the web page monitoring result;
the storage module is used to store the DOM tree corresponding to the second web page source code in a graph database according to the structure of that DOM tree.
The graph database of the invention can record the DOM tree structure. Compared with monitoring web pages by means of two-dimensional tables, there is no need to construct two-dimensional tables according to the new page structure, so the method is simpler and less error-prone.
The foregoing is a simplified summary intended to provide an understanding of some aspects of the invention. This section is neither an extensive nor an exhaustive overview of the invention and its various embodiments. It is intended neither to identify key or critical features of the invention nor to limit its scope, but to present selected principles of the invention in a simplified form as an introduction to the more detailed description given below. It should be understood that other embodiments of the invention are possible using, alone or in combination, one or more of the features set forth above or described in detail below.
Brief description of the drawings
To explain the technical solutions in the embodiments of the invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic diagram of a DOM tree structure in the prior art;
Fig. 2 is a schematic flowchart of the method in one embodiment of the invention.
Detailed description of the embodiments
The invention is described below in conjunction with an illustrative communication system.
To this end, a web data monitoring method comprises:
comparing the node data of each node of the DOM tree corresponding to the first web page source code with that of the DOM tree corresponding to the second web page source code to obtain a web page monitoring result;
wherein the node data of each node of the DOM tree corresponding to the second web page source code is stored in a graph database according to the structure of that DOM tree.
A graph database is a non-relational database that applies graph theory to store the relationship information between entities. In web page monitoring, this means that after the page structure changes there is no need to construct multiple complex two-dimensional relational tables according to the new page structure in order to monitor the data in it.
The graph database of the invention can record the DOM tree structure. Therefore, compared with comparing page content by means of two-dimensional tables, the invention does not need to construct two-dimensional tables according to the new page structure, and is thus simpler and less error-prone.
As shown in Fig. 2, in one embodiment of the invention, the web data monitoring method comprises:
S101: receiving first web page source code;
S102: parsing the first web page source code to obtain the DOM tree of the first web page source code;
S103: obtaining from the graph database, according to the order in which tag node x is visited while the DOM tree of the first web page source code is traversed, the corresponding node data y of the DOM tree of the second web page source code;
S104: comparing the data of tag node x with the corresponding node data y to obtain a comparison result, and repeating until all tag nodes of the DOM tree of the first web page source code have been traversed.
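The loop of S103 and S104 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation. A plain dictionary stands in for the graph database, trees are nested tuples rather than parsed pages, and "corresponding" node data is looked up by the chain of child positions from the root. The names walk, store and monitor, and the two sample pages, are hypothetical.

```python
# Trees are nested tuples: (tag name, attribute dict, text content, list of children).
def walk(node, path=()):
    """Depth-first traversal yielding (path, name, attrs, text) for every tag node.
    The path is the chain of child positions from the root, e.g. (1, 0)."""
    name, attrs, text, children = node
    yield path, name, attrs, text
    for n, child in enumerate(children):
        yield from walk(child, path + (n,))

def store(tree):
    """Stand-in for the graph database: node data of the old page keyed by path."""
    return {path: (name, attrs, text) for path, name, attrs, text in walk(tree)}

def monitor(first_tree, stored):
    """For each tag node x of the new page, fetch node data y and compare."""
    results = []
    for path, name, attrs, text in walk(first_tree):
        y = stored.get(path)
        if y is None or y[0] != name:
            results.append((path, "structure changed"))
        elif y != (name, attrs, text):
            results.append((path, "content changed"))
        else:
            results.append((path, "unchanged"))
    return results

old_page = ("html", {"lang": "en"}, "", [
    ("head", {}, "", []),
    ("body", {}, "", [("p", {}, "Hello", [])]),
])
new_page = ("html", {"lang": "en"}, "", [
    ("head", {}, "", []),
    ("body", {}, "", [("p", {}, "Hello again", []), ("div", {}, "", [])]),
])

report = monitor(new_page, store(old_page))
for path, result in report:
    print(path, result)
```

Because only the new page is traversed, this sketch reports nodes that were added or edited; a node that exists only in the stored page would go unnoticed unless the stored data is also walked.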
In one embodiment of the invention, the second web page source code is the source code of a historical web page, and the first web page source code is the source code of the new web page; in this embodiment, what is monitored is whether the content or the structure of the web page has changed.
A tag refers to an HTML tag, and a node refers to a node in the DOM tree. When web page source code is parsed, the corresponding DOM tree is generated from the tags, so the nodes in the DOM tree are tag nodes. Tag node x denotes any one tag node. It can be understood that if the structure of the second web page source code is identical to that of the first web page source code, traversing the DOM tree of the first web page source code and traversing the DOM tree of the second web page source code stored in the graph database with the same method yields node chains (i.e. NodeLists) made up of nodes with the same names.
In one embodiment of the invention, the second web page source code is stored in the graph database in the form of a NodeList, so that the nodes of the second web page source code can be obtained quickly from the graph database according to the order of the nodes in the NodeList. It can be understood that the graph database can be implemented in many ways: for example, a self-designed text database may be used to store the NodeList and the relationship between the NodeList and the keywords of the corresponding web page source code, or the implementation may be based on an existing graph database. The traversal may be depth-first or breadth-first.
HTML tags can have attributes, which provide additional information about HTML elements. Attributes always appear as name/value pairs, such as name="value", and are always specified in the start tag of an HTML element. An Attr object represents an attribute of a tag. The Attr interface inherits from Node, but because an Attr is actually contained in an Element, it does not appear as a separate node in the DOM tree and is not part of the DOM tree. Therefore an Attr cannot be obtained with the methods used to obtain nodes, and in use it must be distinguished from the other Node sub-objects. The invention supports the complete list of legal attributes available to each HTML element. For more details, refer to the HTML definition of attributes and the rules for parsing them.
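The distinction between attributes and tree nodes can be observed with Python's standard xml.dom.minidom module, which implements the W3C DOM interfaces. The sketch below is only an illustration: the fragment is hypothetical, and because minidom is an XML parser it works only for markup that is also well-formed XML.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

# Hypothetical, well-formed fragment used only for illustration.
doc = parseString('<a href="/detail" class="link">More</a>')
element = doc.documentElement

# Attributes are reached through the element, as name/value pairs ...
attributes = dict(element.attributes.items())

# ... and are not among the element's child nodes, which here hold only the text.
child_types = [child.nodeType for child in element.childNodes]

# An Attr is still a kind of Node, but it belongs to its element instead of
# occupying a position in the tree.
href = element.getAttributeNode("href")

print(attributes)
print(child_types == [Node.TEXT_NODE])
print(href.nodeType == Node.ATTRIBUTE_NODE, href.ownerElement is element)
```

A monitor that stores node data therefore has to read attributes from each element explicitly, since a traversal of child nodes alone never reaches them.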
A 'parent-child-N' relationship is defined between nodes, where N indicates that the child node is the Nth child of the parent node. Paths are built at the time of storage: starting from the root node of the DOM tree, the nodes of the tree are traversed along the relationships and their directions, according to the relationship between child node and parent node and the relationship among child nodes that belong to the same parent. That is, navigation proceeds step by step along the relationships from a start node to an end node, and the ordered combination of all nodes and relationships passed through forms a path.
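One way to picture the 'parent-child-N' relationship and the resulting paths is the sketch below. It is a minimal illustration under assumptions: nodes are identified by their position in the traversal, relationships are plain tuples rather than graph database records, and the names build_graph and path_to and the sample tree are hypothetical.

```python
def build_graph(tree):
    """Return (nodes, relationships) for a tree of nested (name, children) tuples.

    nodes:         {node_id: name}
    relationships: [(parent_id, N, child_id)], read as 'child is the Nth child of parent'
    """
    nodes, relationships = {}, []

    def visit(node, parent_id=None, n=None):
        name, children = node
        node_id = len(nodes)  # ids follow the traversal order
        nodes[node_id] = name
        if parent_id is not None:
            relationships.append((parent_id, n, node_id))
        for position, child in enumerate(children, start=1):
            visit(child, node_id, position)

    visit(tree)
    return nodes, relationships

def path_to(node_id, nodes, relationships):
    """Follow relationships back to the root, then render the path from root to node."""
    parent_of = {child: (parent, n) for parent, n, child in relationships}
    steps = [nodes[node_id]]
    while node_id in parent_of:
        node_id, n = parent_of[node_id]
        steps.append(f"-[parent-child-{n}]->")
        steps.append(nodes[node_id])
    return " ".join(reversed(steps))

tree = ("html", [("head", []), ("body", [("p", []), ("div", [])])])
nodes, relationships = build_graph(tree)
print(relationships)
print(path_to(4, nodes, relationships))
```

The rendered path alternates nodes and relationships from the root to the chosen node, which is the ordered combination the paragraph above calls a path.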
The interval between acquiring the second web page and acquiring the first web page is preset; that is, the time interval of web page monitoring is preset.
A node herein refers to a node in the DOM tree, that is, a node formed by an HTML tag in the web page.
In one embodiment of the invention, the data of the web page source code is stored in the graph database according to the relationships between nodes in the DOM tree of the web page source code. To monitor whether the page content has changed, it is only necessary to traverse the DOM tree of the first web page and, following the order in which nodes are visited during that traversal, read the corresponding node data of the second web page source code from the graph database one by one and compare them to learn whether the content of the monitored page has changed.
Because the second web page source code is stored in the graph database according to the relationships between the tag nodes of its DOM tree, the corresponding node data of the second web page source code can be read quickly when it is read from the graph database one by one in the order in which nodes are visited while traversing the DOM tree of the first web page source code. If the corresponding node data cannot be found, the page structure of the first web page source code has changed relative to that of the second web page source code, and the changed part can be represented by the subtree rooted at the node being compared at that moment.
It can be understood that, compared with text-based comparison of web pages, which requires the DOM tree of the second web page to be rebuilt at comparison time, the method of the invention reduces memory consumption and obtains the corresponding node data of the second web page quickly. And compared with comparing page content by means of two-dimensional tables, it does not need to construct two-dimensional tables according to the new page structure, and is therefore simpler and less error-prone.
There may be three types of comparison result: first, nothing has changed; second, the content of the node data has changed; third, the structure of the DOM tree has changed.
If the content of the node data has changed, the new web page source code is stored in the graph database and the changed data content is marked;
If the structure of the DOM tree has changed, the new web page source code is stored in the graph database and the changed subtree is marked. Marking the changed subtree may consist of marking the root node of the changed subtree.
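The three result types and the marking step might be expressed as in the sketch below. The record layout (a dictionary with name, attrs, content and an ordered list of child names), the Result enumeration, and the compare and mark functions are assumptions introduced for illustration; the text does not prescribe how a structural change is detected at a single node.

```python
from enum import Enum

class Result(Enum):
    UNCHANGED = "unchanged"
    CONTENT_CHANGED = "content changed"
    STRUCTURE_CHANGED = "structure changed"

def compare(x, y):
    """Compare tag node x of the new page with stored node data y of the old page.

    Each is a dict with 'name', 'attrs', 'content' and 'children' (child names in
    order); y is None when no corresponding node data was found.
    """
    if y is None or x["name"] != y["name"] or x["children"] != y["children"]:
        return Result.STRUCTURE_CHANGED
    if x["attrs"] != y["attrs"] or x["content"] != y["content"]:
        return Result.CONTENT_CHANGED
    return Result.UNCHANGED

def mark(path, result, marks):
    """Record a change against the node at 'path'; for a structural change that
    node is the root of the changed subtree."""
    if result is not Result.UNCHANGED:
        marks[path] = result.value
    return marks

old = {"name": "ul", "attrs": {"class": "cast"}, "content": "", "children": ["li", "li"]}
same = dict(old)
edited = {**old, "attrs": {"class": "cast wide"}}
restructured = {**old, "children": ["li"]}

marks = {}
mark((1,), compare(same, old), marks)
mark((1, 0), compare(edited, old), marks)
mark((1, 1), compare(restructured, old), marks)
mark((1, 2), compare(same, None), marks)
print(marks)
```

Marking only the root of a changed subtree keeps the record small, since everything below that root is implied.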
Because the data is stored in the graph database, all of the second web page source code can be obtained quickly when its data needs to be retrieved from the database.
In one embodiment of the invention, the web data monitoring method comprises:
S111: receiving first web page source code;
S112: parsing the first web page source code to obtain the DOM tree of the first web page source code;
S113: obtaining node data y of the DOM tree of the second web page source code from the graph database;
S114: obtaining, according to the relationships between node data in the graph database, the data of the corresponding tag node x of the DOM tree of the first web page source code;
S115: comparing the data of tag node x with the corresponding node data y to obtain a comparison result, and repeating until all node data of the DOM tree of the second web page source code has been traversed.
It can be understood that, owing to the characteristics of the graph database, when node data y of the DOM tree of the second web page source code is obtained from the graph database, the relationship between node data y and other node data, such as its relationship to the root node of the DOM tree, is obtained as well. According to that relationship, the tag node that has the same relationship to the root node of its own DOM tree is found in the first web page source code.
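The second embodiment runs in the opposite direction: it iterates over the stored node data and locates the matching tag node in the new page. The sketch below assumes that the relationship to the root is available as a chain of child positions and uses a dictionary in place of the graph database. The names find_by_relationship and monitor_from_store and the sample data are hypothetical.

```python
def find_by_relationship(tree, chain):
    """Follow a chain of child positions from the root of a nested
    (name, content, children) tree; return None when the chain leads nowhere."""
    node = tree
    for n in chain:
        children = node[2]
        if n >= len(children):
            return None
        node = children[n]
    return node

def monitor_from_store(stored, first_tree):
    """Iterate over stored node data y and look up the corresponding tag node x."""
    results = {}
    for chain, (name, content) in stored.items():
        x = find_by_relationship(first_tree, chain)
        if x is None or x[0] != name:
            results[chain] = "structure changed"
        elif x[1] != content:
            results[chain] = "content changed"
        else:
            results[chain] = "unchanged"
    return results

# Stand-in for the stored second (historical) page: chain from root -> (name, content).
stored = {
    (): ("html", ""),
    (0,): ("head", ""),
    (1,): ("body", ""),
    (1, 0): ("p", "Hello"),
    (1, 1): ("div", "Cast list"),
}
# New page: the paragraph text was edited and the div was removed.
first_tree = ("html", "", [
    ("head", "", []),
    ("body", "", [("p", "Hi", [])]),
])

results = monitor_from_store(stored, first_tree)
for chain, result in results.items():
    print(chain, result)
```

Driving the comparison from the stored data surfaces nodes that were removed from the new page, which a traversal of the new page alone cannot see.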
In one embodiment of the invention, the manner in which the second web page source code is stored in the graph database corresponds to the manner in which the first web page source code is traversed. The DOM tree corresponding to the second web page source code is traversed using method A, the corresponding NodeList is obtained in traversal order, and the NodeList is stored in the graph database; when the web page is checked, the first web page is traversed using the same method A, and the NodeList corresponding to the second web page source code is read from the graph database item by item during the traversal.
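The sketch below illustrates why the same traversal method is needed on both sides. It stores the node names of an unchanged page with one method and reads them back in step with a traversal of the same page. A list stands in for the stored NodeList, and the function names and sample page are hypothetical.

```python
from collections import deque

def depth_first(tree):
    """Pre-order depth-first list of node names for a nested (name, children) tree."""
    name, children = tree
    names = [name]
    for child in children:
        names.extend(depth_first(child))
    return names

def breadth_first(tree):
    """Level-by-level list of node names."""
    names, queue = [], deque([tree])
    while queue:
        name, children = queue.popleft()
        names.append(name)
        queue.extend(children)
    return names

def read_in_step(store_method, traverse_method, old_tree, new_tree):
    """Store the old page with one method, traverse the new page with another, and
    pair the i-th visited node with the i-th stored node."""
    node_list = store_method(old_tree)  # stand-in for the stored NodeList
    return list(zip(traverse_method(new_tree), node_list))

page = ("html", [("head", [("title", [])]), ("body", [("p", [])])])

same_method = read_in_step(depth_first, depth_first, page, page)
mixed_methods = read_in_step(depth_first, breadth_first, page, page)

print(all(x == y for x, y in same_method))
print([pair for pair in mixed_methods if pair[0] != pair[1]])
```

With matching methods every pair lines up; with mismatched methods an unchanged page already produces false differences.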
In one embodiment of the invention, the graph database is Neo4j. The most basic concepts in Neo4j are nodes and relationships. A node represents an entity and is represented by the org.neo4j.graphdb.Node interface. Different relationships may exist between two nodes. A relationship is represented by the org.neo4j.graphdb.Relationship interface. Each relationship consists of three elements: a start node, an end node, and a type. The existence of a start node and an end node means that a relationship has a direction, similar to an edge in a directed graph. A NodeList object is used to obtain the nodes of the DOM tree in traversal order. When a node is obtained, its node name, node attributes, node content, parent-child relationships and so on can be obtained from the DOM tree; the node name, node attributes, node content and parent-child relationships are combined into one item of node data, which is stored as a Node in the NodeList.
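As a rough illustration of how such node data could be written to Neo4j, the sketch below builds Cypher statements and their parameters as plain Python values. Nothing here is taken from the original text: the DomNode label, the PARENT_CHILD relationship type, the property names, and the choice to keep attributes as a JSON string are all assumptions. The statements are only constructed, not sent to a database.

```python
import json

def node_statement(node_id, name, attrs, content):
    """Return (Cypher text, parameters) that would create one DOM node."""
    query = ("CREATE (n:DomNode {id: $id, name: $name, "
             "attrs: $attrs, content: $content})")
    params = {
        "id": node_id,
        "name": name,
        # Assumption: attributes are flattened to a JSON string so they fit in one property.
        "attrs": json.dumps(attrs, sort_keys=True),
        "content": content,
    }
    return query, params

def relationship_statement(parent_id, child_id, n):
    """Return (Cypher text, parameters) linking a parent to its Nth child.
    The direction parent -> child mirrors the start node and end node of a relationship."""
    query = ("MATCH (p:DomNode {id: $parent}), (c:DomNode {id: $child}) "
             "CREATE (p)-[:PARENT_CHILD {n: $n}]->(c)")
    return query, {"parent": parent_id, "child": child_id, "n": n}

statements = [
    node_statement(0, "html", {"lang": "en"}, ""),
    node_statement(1, "body", {}, ""),
    relationship_statement(0, 1, 2),
]
for query, params in statements:
    print(query)
    print(params)
```

Each pair could then be passed to whichever Neo4j client is in use; keeping the values in parameters avoids building queries by string concatenation.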
The node name is the name of a node in the DOM tree, and the node attributes are the attributes of that node. The node content is the value of the node, that is, the content between the start tag and the end tag.
In one embodiment herein, the second web page source code can be stored by the following method:
receiving the second web page source code;
traversing the DOM tree of the second web page source code to obtain, from the root node of the DOM tree to the leaf nodes, the node name, node attributes, node content, and parent-child relationships of each node;
storing the node data in order, according to the order in which nodes are visited while the DOM tree of the second web page source code is traversed; the node data includes the node name, the node attributes, the node content, and the parent-child relationships of the node.
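The storing steps above can be sketched as a traversal that emits one record per node in visiting order. This is a minimal illustration in which a list stands in for the graph database; the record keys, the function name to_node_list and the sample page are assumptions.

```python
def to_node_list(tree):
    """Traverse from the root to the leaves and return node data in visiting order.
    The tree is a nested (name, attrs, content, children) tuple."""
    node_list = []

    def visit(node, parent_index, n):
        name, attrs, content, children = node
        index = len(node_list)
        node_list.append({
            "order": index,          # position in the traversal
            "name": name,
            "attrs": attrs,
            "content": content,
            "parent": parent_index,  # None for the root
            "child_no": n,           # this node is the Nth child of its parent
        })
        for position, child in enumerate(children, start=1):
            visit(child, index, position)

    visit(tree, None, None)
    return node_list

page = ("html", {"lang": "en"}, "", [
    ("head", {}, "", []),
    ("body", {}, "", [("p", {"class": "intro"}, "Hello", [])]),
])

node_list = to_node_list(page)
for record in node_list:
    print(record)
```

Each record carries the four parts of node data named in the text, plus the traversal position that later lets a monitor read the records back in the same order.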
The present invention also provides computer equipment, comprising a memory, a processor, and a computer program that is stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the steps of any of the methods described above.
The present invention also provides a computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the steps of any of the methods described above.
The present invention also provides equipment comprising a comparison module, a result module, and a storage module;
the comparison module compares the node data of each node of the DOM tree corresponding to the first web page source code with that of the DOM tree corresponding to the second web page source code;
the result module is used to obtain the web page monitoring result;
the storage module is used to store the DOM tree corresponding to the second web page source code in a graph database according to the structure of that DOM tree.
"At least one", "one or more" and "and/or" as used herein are open-ended expressions that are both conjunctive and disjunctive in use. For example, "at least one of A, B and C", "at least one of A, B or C", "one or more of A, B and C" and "one or more of A, B or C" each mean A alone, B alone, C alone, A and B together, A and C together, B and C together, or A, B and C together.
The term "a" or "an" entity refers to one or more of that entity. The terms "a" (or "an"), "one or more" and "at least one" are therefore used interchangeably herein. It should also be noted that the terms "comprising", "including" and "having" can be used interchangeably.
The term "automatic" and its variants, as used herein, refer to any process or operation completed without material human input when the process or operation is performed. However, a process or operation can be automatic even if its performance uses material or immaterial human input received before the process or operation is performed. Human input is deemed material if it influences how the process or operation will be performed. Human input that does not influence the performance of the process or operation is not deemed material.
The term "computer-readable medium" as used herein refers to any tangible storage device and/or transmission medium that participates in providing instructions to a processor for execution. A computer-readable medium may be a serial instruction set encoded in a network transmission (such as SOAP) over an IP network. Such a medium may take many forms, including but not limited to non-volatile media, volatile media, and transmission media. Non-volatile media include, for example, NVRAM or magnetic or optical disks. Volatile media include dynamic memory such as main memory (for example, RAM). Common forms of computer-readable media include, for example, a floppy disk, flexible disk, hard disk, magnetic tape or any other magnetic medium, a magneto-optical medium, a CD-ROM or any other optical medium, punch cards, paper tape or any other physical medium with patterns of holes, RAM, PROM, EPROM, FLASH-EPROM, a solid-state medium such as a memory card, any other memory chip or cartridge, a carrier wave as described below, or any other medium from which a computer can read. A digital file attachment to an e-mail or another self-contained information archive or set of archives is considered a distribution medium equivalent to a tangible storage medium. When the computer-readable medium is configured as a database, it should be understood that the database may be any type of database, such as a relational, hierarchical or object-oriented database. Accordingly, the invention is considered to include a tangible storage medium or distribution medium, prior-art-recognized equivalents, and successor media in which the software implementations of the invention are stored.
The terms "determine", "compute" and "calculate" and their variants, as used herein, are used interchangeably and include any type of method, process, mathematical operation or technique. More specifically, such terms may include interpreted rules or rule languages such as BPEL, in which logic is not hard-coded but is represented in a rules file that can be read, interpreted, compiled and executed.
The term "module" or "tool" as used herein refers to any known or later-developed hardware, software, firmware, artificial intelligence, fuzzy logic, or combination of hardware and software that is capable of performing the function associated with that element. In addition, although the invention is described in terms of exemplary embodiments, it should be understood that individual aspects of the invention can be separately claimed.
It should be noted that relational terms such as "first" and "second" are used herein only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "comprise" and "include" and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or terminal device that comprises a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to the process, method, article or terminal device. Without further limitation, an element introduced by the phrase "comprising ..." or "including ..." does not exclude the presence of additional elements in the process, method, article or terminal device that includes the element. In addition, herein, "greater than", "less than", "exceeding" and the like are understood as excluding the stated number, while "above", "below", "within" and the like are understood as including it.
Although the above embodiments have been described, a person skilled in the art, once aware of the basic inventive concept, can make further changes and modifications to them. The above is therefore only an embodiment of the invention and does not limit the scope of patent protection; any equivalent structure or equivalent process transformation made using the contents of the description and drawings of the invention, or any direct or indirect application in other related technical fields, likewise falls within the scope of patent protection of the invention.
Claims (10)
1. A web data monitoring method, characterized by comprising:
comparing the node data of each node of the DOM tree corresponding to first web page source code with that of the DOM tree corresponding to second web page source code to obtain a web page monitoring result;
wherein the node data of each node of the DOM tree corresponding to the second web page source code is stored in a graph database according to the structure of the DOM tree corresponding to the second web page source code.
2. The method according to claim 1, characterized in that comparing the node data of each node of the DOM tree corresponding to the first web page source code with that of the DOM tree corresponding to the second web page source code comprises:
parsing the first web page source code to obtain the DOM tree of the first web page source code;
obtaining from the graph database, according to the order in which tag node x is visited while the DOM tree of the first web page source code is traversed, the corresponding node data y of the DOM tree of the second web page source code;
comparing the data of tag node x with the corresponding node data y;
and continuing until all tag nodes of the DOM tree of the first web page source code have been traversed and compared.
3. The method according to claim 1, characterized in that comparing the node data of each node of the DOM tree corresponding to the first web page source code with that of the DOM tree corresponding to the second web page source code comprises:
parsing the first web page source code to obtain the DOM tree of the first web page source code;
obtaining node data y of the DOM tree of the second web page source code from the graph database;
obtaining, according to the relationships between node data in the graph database, the data of the corresponding tag node x of the DOM tree of the first web page source code;
comparing the data of tag node x with the corresponding node data y;
and continuing until all node data of the DOM tree of the second web page source code has been traversed and compared.
4. The method according to claim 1, characterized in that the manner in which the node data of each node of the DOM tree corresponding to the second web page source code is stored in the graph database corresponds to the manner in which the first web page source code is traversed.
5. The method according to claim 1, wherein the graph database is Neo4j.
6. The method according to claim 1, wherein before the first web page source code is received, the method further comprises:
traversing the DOM tree of the second web page source code to obtain the node data from the root node of the DOM tree to the leaf nodes;
storing the node data in order, according to the order in which nodes are visited while the DOM tree of the second web page source code is traversed.
7. The method according to claim 1, wherein the node data includes the node name, the node attributes, the node content, and the parent-child relationships of the node.
8. a kind of computer equipment, comprising: memory, processor and be stored on the memory and can be in the processor
The computer program of upper execution, which is characterized in that the processor is realized when executing described program such as any institute of claim 1-7
The step of stating method.
9. a kind of computer readable storage medium, is stored thereon with computer program, which is characterized in that the program is held by processor
The step of the method as any such as claim 1-7 is realized when row.
10. A device, characterized by comprising a comparison module, an object module, and a storage module;
the comparison module is configured to compare each node data of the DOM tree corresponding to the first webpage source code with that of the DOM tree corresponding to the second webpage source code;
the object module is configured to obtain a webpage monitoring result;
the storage module is configured to store the DOM tree corresponding to the second webpage source code in the graph database according to the structure of that DOM tree.
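The storage and comparison steps recited in the claims above can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: an in-memory dictionary stands in for the graph database (Neo4j in claim 5), Python's standard `html.parser` stands in for the DOM parser, text content is attached only to the innermost open element, and the names `NodeData`, `OrderedDomParser`, `store_baseline`, and `compare` are hypothetical.

```python
from dataclasses import dataclass
from html.parser import HTMLParser
from typing import Dict, List, Optional, Tuple

# HTML elements that never have a closing tag, so they must not stay open.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


@dataclass
class NodeData:
    """Hypothetical record for the node data listed in claim 7."""
    order: int                          # position in the traversal (access order)
    name: str                           # node name, e.g. "p"
    attrs: Tuple[Tuple[str, str], ...]  # node attributes
    parent: Optional[int]               # access order of the parent; None for the root
    content: str = ""                   # text directly inside the node


class OrderedDomParser(HTMLParser):
    """Collects element nodes in document order (a pre-order traversal)."""

    def __init__(self) -> None:
        super().__init__()
        self.nodes: List[NodeData] = []
        self._open: List[int] = []      # access orders of currently open elements

    def handle_starttag(self, tag, attrs):
        parent = self._open[-1] if self._open else None
        clean = tuple((k, v or "") for k, v in attrs)
        self.nodes.append(NodeData(len(self.nodes), tag, clean, parent))
        if tag not in VOID_TAGS:
            self._open.append(len(self.nodes) - 1)

    def handle_endtag(self, tag):
        # Close the nearest open element with this name and anything left open inside it.
        for i in range(len(self._open) - 1, -1, -1):
            if self.nodes[self._open[i]].name == tag:
                del self._open[i:]
                break

    def handle_data(self, data):
        text = data.strip()
        if text and self._open:
            self.nodes[self._open[-1]].content += text


def parse_nodes(source: str) -> List[NodeData]:
    parser = OrderedDomParser()
    parser.feed(source)
    parser.close()
    return parser.nodes


def store_baseline(second_source: str) -> Dict[int, NodeData]:
    """Store the second (baseline) page's node data keyed by access order.
    The dict is an assumed stand-in for the graph database."""
    return {node.order: node for node in parse_nodes(second_source)}


def compare(first_source: str, store: Dict[int, NodeData]) -> List[Tuple[int, str]]:
    """Traverse the first page and compare each node x with the stored node y
    that has the same access order."""
    first_nodes = parse_nodes(first_source)
    diffs: List[Tuple[int, str]] = []
    for x in first_nodes:
        y = store.get(x.order)
        if y is None:
            diffs.append((x.order, "added"))
        elif x != y:
            diffs.append((x.order, "changed"))
    # Stored nodes beyond the end of the first page's traversal no longer exist.
    for order in sorted(store):
        if order >= len(first_nodes):
            diffs.append((order, "removed"))
    return diffs


baseline = store_baseline("<html><body><h1>Title</h1><p class='a'>old</p></body></html>")
current = "<html><body><h1>Title</h1><p class='a'>new</p></body></html>"
print(compare(current, baseline))  # [(3, 'changed')]
```

Because only one node of the first page and one stored record are compared at a time, the sketch never needs both complete trees side by side during comparison, which is the memory-saving idea the abstract describes. A purely positional match like this one reports every node after an insertion as changed; a real system would likely need a more tolerant matching rule.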
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710552265.6A CN109255088A (en) | 2017-07-07 | 2017-07-07 | Web data monitoring method and equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109255088A (en) | 2019-01-22 |
Family
ID=65050920
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710552265.6A Pending CN109255088A (en) | 2017-07-07 | 2017-07-07 | Web data monitoring method and equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109255088A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111859231A (en) * | 2019-04-30 | 2020-10-30 | 中移(苏州)软件技术有限公司 | Webpage monitoring method, equipment, device and computer storage medium |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20050039117A1 (en) * | 2003-08-15 | 2005-02-17 | Fuhwei Lwo | Method, system, and computer program product for comparing two computer files |
CN101184105A (en) * | 2006-11-18 | 2008-05-21 | 国际商业机器公司 | Client apparatus for updating data |
CN102129428A (en) * | 2010-01-20 | 2011-07-20 | 腾讯科技(深圳)有限公司 | Method and device for subscribing information from webpage |
CN102193990A (en) * | 2011-03-25 | 2011-09-21 | 北京世纪互联工程技术服务有限公司 | Pattern database and realization method thereof |
CN102682098A (en) * | 2012-04-27 | 2012-09-19 | 北京神州绿盟信息安全科技股份有限公司 | Method and device for detecting web page content changes |
CN103577526A (en) * | 2013-08-01 | 2014-02-12 | 星云融创(北京)信息技术有限公司 | Method and system as well as browser for verifying page modification |
CN104391964A (en) * | 2014-12-01 | 2015-03-04 | 南京大学 | Method for storing source codes into graph database |
CN105630902A (en) * | 2015-12-21 | 2016-06-01 | 明博教育科技股份有限公司 | Method for rendering and incrementally updating webpages |
Non-Patent Citations (1)
Title |
---|
胡昌龙: "《虚拟社会网络下群行为感知与规律研究》", 30 November 2016, 武汉大学出版社 * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Berlingerio et al. | Abacus: frequent pattern mining-based community discovery in multidimensional networks | |
Goasdoué et al. | RDF graph summarization for first-sight structure discovery | |
Tekli et al. | A novel XML document structure comparison framework based-on sub-tree commonalities and label semantics | |
Dodds et al. | Linked data patterns | |
JP2005092889A (en) | Information block extraction apparatus and method for web page | |
CN103177094A (en) | Cleaning method of data of internet of things | |
Grandi | Dynamic class hierarchy management for multi-version ontology-based personalization | |
Di Iorio et al. | Dealing with structural patterns of XML documents | |
Kiu et al. | TaxoFolk: a hybrid taxonomy–folksonomy classification for enhanced knowledge navigation | |
Cole et al. | Suffix trays and suffix trists: Structures for faster text indexing | |
US9524351B2 (en) | Requesting, responding and parsing | |
CN109255088A (en) | Web data monitoring method and equipment | |
Asghari et al. | XML document clustering: techniques and challenges | |
KR101380605B1 (en) | A Hypergraph-based Storage Method for Managing RDF Version | |
Lim et al. | Generalized and lightweight algorithms for automated web forum content extraction | |
Murolo et al. | Revisiting web data extraction using in-browser structural analysis and visual cues in modern web designs | |
Manica et al. | Orion: A cypher-based web data extractor | |
Gayoso-Cabada et al. | Learning object repositories with dynamically reconfigurable metadata schemata | |
Zheng et al. | Research on the Application of XML in Fault Diagnosis IETM | |
Faheem | Intelligent content acquisition in Web archiving | |
Mattam et al. | A Framework for Knowledgebase Curation using Cognitive Web Architecture | |
Lohmann | Conceptualization and visualization of tagging and folksonomies | |
Keegan et al. | Analyzing multi-dimensional networks within mediawikis | |
Jayanthi et al. | Referenced attribute Functional Dependency Database for visualizing web relational tables | |
Ciaccia et al. | The collection index to support complex approximate queries |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20190122 |