CN109255088A - Web data monitoring method and equipment - Google Patents


Info

Publication number
CN109255088A
Authority
CN
China
Prior art keywords
source code
node
dom tree
webpage source
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710552265.6A
Other languages
Chinese (zh)
Inventor
张春荣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Potevio Information Technology Co Ltd
Putian Information Technology Co Ltd
Original Assignee
Putian Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Putian Information Technology Co Ltd
Priority to CN201710552265.6A
Publication of CN109255088A

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention provides a web data monitoring method and device, addressing the need for a monitoring method that has low memory consumption, high monitoring efficiency, and a simple monitoring process. The method comprises: receiving a first webpage source code; parsing the first webpage source code to obtain its DOM tree; while traversing the DOM tree of the first webpage source code, obtaining from a graph database, according to the access order of tag node x, the corresponding node data y of the DOM tree of a second webpage source code; comparing the data of tag node x with the corresponding node data y to obtain a comparison result; and continuing the comparison until all tag nodes of the DOM tree of the first webpage source code have been traversed. The method of the invention reduces memory consumption.

Description

Web data monitoring method and equipment
Technical field
The present invention relates to computer technologies, and in particular to a web data monitoring method and device.
Background art
DOM is a W3C standard that defines how HTML and XML documents are accessed. The HTML DOM defines the objects and properties of all HTML elements, and the methods for accessing them. Consider, for example, the following piece of webpage source code:
Here, html is the root node of the webpage, 'lang="en"' is an attribute of the root node, and head and body are the two child nodes of html. Continuing in this way, the logical relationships of the entire webpage, together with node attributes and content, can be presented in the form of a tree, as shown in Fig. 1.
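The source listing referred to above is not reproduced in this text. As a minimal sketch of the idea, the following Python code parses a hypothetical page (the `PAGE` string, the `TreeBuilder` class, and the dict layout are illustrative assumptions, not part of the patent) into a tree whose root is html with the attribute lang="en" and the children head and body. It assumes well-formed HTML with explicit end tags.

```python
from html.parser import HTMLParser

# Hypothetical page used only for illustration; it is not the patent's listing.
PAGE = '<html lang="en"><head><title>Demo</title></head><body><p>Hello</p></body></html>'


class TreeBuilder(HTMLParser):
    """Builds a nested dict tree from well-formed HTML with explicit end tags."""

    def __init__(self):
        super().__init__()
        self.root = None
        self.stack = []  # chain of currently open elements

    def handle_starttag(self, tag, attrs):
        node = {"name": tag, "attrs": dict(attrs), "text": "", "children": []}
        if self.stack:
            self.stack[-1]["children"].append(node)
        else:
            self.root = node
        self.stack.append(node)

    def handle_endtag(self, tag):
        if self.stack:
            self.stack.pop()

    def handle_data(self, data):
        if self.stack:
            self.stack[-1]["text"] += data.strip()


def parse(source):
    builder = TreeBuilder()
    builder.feed(source)
    builder.close()
    return builder.root


def outline(node, depth=0):
    """Returns one indented line per tag node, in depth-first order."""
    lines = ["  " * depth + node["name"]]
    for child in node["children"]:
        lines.extend(outline(child, depth + 1))
    return lines


tree = parse(PAGE)
print("\n".join(outline(tree)))
```

The attribute is kept on the html node itself rather than as a child, which matches the tree form described above.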
To determine which specific node's content in the webpage source code has changed, or to judge whether the webpage structure has changed and which part of it has changed, the webpage source code must first be parsed according to the syntax rules of HTML before content and structure can be compared. Parsing webpage source code consumes a great deal of memory, so this comparison method is not commonly used.
In the prior art, another approach monitors a webpage based on data obtained by parsing it: the acquired data are stored in two-dimensional relational tables, and whether the webpage content has changed is judged by comparing the records that correspond to the two webpages in those tables. However, the page structure of a website may change (that is, the page layout is updated; for example, a TV drama detail page may list other dramas featuring the lead actors before an update but not after it). A table schema designed in advance may therefore not suit a webpage whose structure has changed, and a new table must be designed for the new webpage to store the data collected from it so that changes in its content can be monitored.
The prior art often handles the records of a new webpage by adding tables or updating table fields. For example, if a display area for actors is added to the new webpage, an actor field is added to the corresponding table and a new actor table is created to record the data content of the newly added area.
As can be seen from the above, because the structure of a webpage may be extremely complex, the associations between tables also become very complex, and these associations often lack documentation, which makes it difficult to construct efficient queries for searching the collected data. Moreover, when the webpage structure changes and the structure of these tables must be updated or relation tables added, errors are likely. Once the table relationships or structures are wrong, it is difficult to guarantee that the meaning of the data recorded in the tables is consistent with the meaning of the data in the webpage, and webpage monitoring ultimately fails.
Accordingly, there is a need for a monitoring method whose monitoring process is simple and not error-prone.
Summary of the invention
In view of the above problems, the present invention proposes a web data monitoring method and device that overcome the above problems or at least partially solve them.
In a first aspect, the present invention provides a web data monitoring method, comprising: comparing the node data of the DOM tree corresponding to a first webpage source code with the node data of the DOM tree corresponding to a second webpage source code to obtain a webpage monitoring result;
Wherein the data of the DOM tree corresponding to the second webpage source code are stored in a graph database.
Optionally, comparing the node data of the DOM tree corresponding to the first webpage source code with the node data of the DOM tree corresponding to the second webpage source code to obtain a webpage monitoring result comprises:
Parsing the first webpage source code to obtain the DOM tree of the first webpage source code;
While traversing the DOM tree of the first webpage source code, obtaining from the graph database, according to the access order of tag node x, the corresponding node data y of the DOM tree of the second webpage source code;
Comparing the data of tag node x with the corresponding node data y to obtain a comparison result;
Continuing until all tag nodes of the DOM tree of the first webpage source code have been traversed.
Optionally, comparing the node data of the DOM tree corresponding to the first webpage source code with the node data of the DOM tree corresponding to the second webpage source code to obtain a webpage monitoring result comprises: parsing the first webpage source code to obtain the DOM tree of the first webpage source code;
Obtaining node data y of the DOM tree of the second webpage source code from the graph database;
Obtaining, according to the relationships between node data stored in the graph database, the data of the corresponding tag node x in the DOM tree of the first webpage source code;
Comparing the data of tag node x with the corresponding node data y to obtain a comparison result;
Continuing until all node data of the DOM tree of the second webpage source code have been traversed.
Optionally, the way the second webpage source code is stored in the graph database corresponds to the way the first webpage source code is traversed.
Optionally, the graph database is Neo4j.
Optionally, before receiving the first webpage source code, the method further comprises:
Traversing the DOM tree of the second webpage source code to obtain the node data from the root node to the leaf nodes of the DOM tree;
Storing the node data in order, according to the order in which nodes are accessed while traversing the DOM tree of the second webpage source code.
Optionally, the node data include the node name, node attributes, node content, and parent-child relationships of a node.
In a second aspect, the present invention provides a computer device, comprising: a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of any of the methods described above when executing the program.
In a third aspect, the present invention provides a computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the steps of any of the methods described above.
In a fourth aspect, the present invention provides a device,
Comprising a comparison module, an object module, and a storage module;
The comparison module is configured to compare each node data item of the DOM tree corresponding to the first webpage source code with that of the DOM tree corresponding to the second webpage source code;
The object module is configured to obtain the webpage monitoring result;
The storage module is configured to store the DOM tree corresponding to the second webpage source code in the graph database according to the structure of that DOM tree.
Because the graph database of the present invention can record the DOM tree structure, monitoring a webpage no longer requires constructing two-dimensional tables for each new webpage structure, as the table-based approach does; the method is therefore simpler and less error-prone.
The foregoing is a simplified summary intended to provide an understanding of some aspects of the present invention. This section is neither an extensive nor an exhaustive overview of the invention and its various embodiments. It is intended neither to identify key or critical features of the invention nor to limit its scope, but to present selected principles of the invention in a simplified form as an introduction to the more detailed description given below. It should be understood that other embodiments of the invention are possible that use, alone or in combination, one or more of the features set forth above or described in detail below.
Brief description of the drawings
To explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic diagram of a DOM tree structure in the prior art;
Fig. 2 is a schematic flowchart of the method in one embodiment of the present invention.
Detailed description of the embodiments
The present invention is described below in conjunction with an illustrative communication system.
To this end, the web data monitoring method comprises:
Comparing each node data item of the DOM tree corresponding to the first webpage source code with that of the DOM tree corresponding to the second webpage source code to obtain a webpage monitoring result;
Wherein each node data item of the DOM tree corresponding to the second webpage source code is stored in a graph database according to the structure of that DOM tree.
A graph database is a non-relational database that applies graph theory to store the relationship information between entities. In webpage monitoring, this means that after the webpage structure changes there is no need to construct multiple complex two-dimensional relational tables for the new structure in order to monitor the data in it.
Because the graph database of the present invention can record the DOM tree structure, comparing webpage content does not require constructing two-dimensional tables for each new webpage structure, as the table-based approach does; the method is therefore simpler and less error-prone.
As shown in Fig. 2, in one embodiment of the present invention, the web data monitoring method comprises:
S101: receiving the first webpage source code;
S102: parsing the first webpage source code to obtain its DOM tree;
S103: while traversing the DOM tree of the first webpage source code, obtaining from the graph database, according to the access order of tag node x, the corresponding node data y of the DOM tree of the second webpage source code;
S104: comparing the data of tag node x with the corresponding node data y to obtain a comparison result, and continuing until all tag nodes of the DOM tree of the first webpage source code have been traversed.
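Steps S102 to S104 can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: a plain Python list (`stored`) stands in for the graph database, trees are hypothetical `(name, attrs, text, children)` tuples, and the helper names `walk` and `compare` are invented for the example.

```python
def walk(node, path="0"):
    """Yields (path, name, attrs, text) for each tag node in depth-first order."""
    name, attrs, text, children = node
    yield path, name, attrs, text
    for n, child in enumerate(children):
        yield from walk(child, f"{path}/{n}")


def compare(first_tree, stored):
    """Compares each tag node x of the first page with the stored record y.

    `stored` holds the records of the second page in the same traversal
    order; it is only a stand-in for reads from a graph database.
    """
    results = []
    for i, x in enumerate(walk(first_tree)):
        y = stored[i] if i < len(stored) else None
        results.append((x[0], x == y))
    return results


# Hypothetical historical (second) and new (first) pages.
old = ("html", {"lang": "en"}, "", [
    ("head", {}, "", []),
    ("body", {}, "", [("p", {}, "Hello", [])]),
])
new = ("html", {"lang": "en"}, "", [
    ("head", {}, "", []),
    ("body", {}, "", [("p", {}, "Hi", [])]),
])

stored = list(walk(old))
for path, same in compare(new, stored):
    print(path, "same" if same else "different")
```

Only the new page is held as a tree here; the old page is consumed record by record, which is the property the method relies on to limit memory use.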
In one embodiment of the present invention, the second webpage source code is the source code of a historical webpage, and the first webpage source code is the source code of the new webpage; in this embodiment, the goal is to monitor whether the content or structure of the webpage has changed.
A tag is an HTML tag, and a node is a node in the DOM tree. When the webpage source code is parsed, the corresponding DOM tree is generated from the tags, so the nodes in the DOM tree are tag nodes. Tag node x denotes any one tag node. It can be understood that if the structure of the second webpage source code is identical to that of the first, then traversing the DOM tree of the first webpage source code and the DOM tree of the second webpage source code stored in the graph database with the same method yields node chains (i.e., NodeLists) composed of nodes with the same names.
In one embodiment of the present invention, the second webpage source code is stored in the graph database in the form of a NodeList, so the nodes of the second webpage source code can be obtained quickly from the graph database by following the order of the nodes in the NodeList. It can be understood that the graph database can be implemented in many ways: for example, a self-designed text database may store the NodeList together with the relationship between the NodeList and the keywords of the corresponding webpage source code, or an existing graph database may be used. The traversal may be depth-first or breadth-first.
An HTML tag can have attributes, which provide additional information about the HTML element. Attributes always appear as name/value pairs, such as name="value", and are always specified in the start tag of the HTML element. An Attr object represents an attribute of a tag. The Attr interface inherits from Node, but because an Attr is actually contained in an Element, it does not appear as a separate node in the DOM tree and is not part of the tree. An Attr therefore cannot be obtained with the methods used to obtain nodes, and in use it must be distinguished from the other Node sub-objects. The present invention supports the complete list of legal attributes available for each HTML element. For more detail, refer to the HTML definition and parsing rules for attributes.
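As a small illustration of attributes arriving with the start tag rather than as separate tree nodes, the following sketch uses Python's standard `html.parser` module. The `AttrCollector` class and the sample anchor tag are hypothetical and serve only as an example.

```python
from html.parser import HTMLParser


class AttrCollector(HTMLParser):
    """Records each start tag together with its name/value attribute pairs."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) tuples taken from the start tag
        self.seen.append((tag, dict(attrs)))


collector = AttrCollector()
collector.feed('<a href="/list" class="nav">More</a>')
print(collector.seen)
```

The attributes are delivered as data belonging to the element, so a monitor can store them as properties of the tag node instead of as child nodes.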
A 'parent-child-N' relationship is defined between nodes, where N indicates that the child node is the N-th child of the parent node. While storing, paths are built: starting from the root node of the DOM tree, the nodes are traversed along the relationships and their directions, following the relationships between child nodes and their parent node and among the child nodes of the same parent. That is, navigation proceeds step by step along relationships from a start node to an end node, and the ordered combination of all nodes and relationships passed through forms a path.
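The relationship and path construction described above might look like the following sketch. The `(name, children)` tuple layout, the dotted node identifiers, and the function names `edges` and `paths` are assumptions made for illustration only.

```python
def edges(node, node_id="0"):
    """Yields (parent_id, relationship, child_id) triples, with N counted from 1."""
    name, children = node
    for n, child in enumerate(children, start=1):
        child_id = f"{node_id}.{n}"
        yield node_id, f"parent-child-{n}", child_id
        yield from edges(child, child_id)


def paths(node, prefix=()):
    """Yields, for every node, the alternating node/relationship sequence from the root."""
    name, children = node
    here = prefix + (name,)
    yield here
    for n, child in enumerate(children, start=1):
        yield from paths(child, here + (f"parent-child-{n}",))


# Hypothetical DOM tree: html -> (head, body -> p)
tree = ("html", [("head", []), ("body", [("p", [])])])

for edge in edges(tree):
    print(edge)
for path in paths(tree):
    print(" -> ".join(path))
```

Each path starts at the root and alternates node names with the relationships followed, so two nodes with the same path occupy the same position in their respective trees.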
The interval between acquiring the second webpage and acquiring the first webpage is preset; that is, the webpage monitoring interval is preset.
A node here means a node in the DOM tree, that is, a node formed by an HTML tag in the webpage.
In one embodiment of the present invention, the data of the webpage source code are stored in the graph database according to the relationships between nodes in the DOM tree of the webpage source code. To monitor whether the webpage content has changed, it is only necessary to traverse the DOM tree of the first webpage, read the corresponding node data of the second webpage source code from the graph database one by one in the order in which nodes are accessed during that traversal, and compare them to determine whether the content of the monitored webpage has changed.
Because the second webpage source code is stored in the graph database according to the relationships between the tag nodes of its DOM tree, the corresponding node data can be read quickly when they are read one by one in the order in which nodes are accessed while traversing the DOM tree of the first webpage source code. If the corresponding node data cannot be found, the webpage structure of the first webpage source code has changed relative to that of the second, and the changed part can be represented by the subtree rooted at the node being compared at that point.
It can be understood that, compared with text-based webpage comparison, which requires rebuilding the DOM tree of the second webpage, the method of the present invention reduces memory consumption and obtains the corresponding node data of the second webpage quickly. Compared with comparing webpage content by means of two-dimensional tables, it does not require constructing tables for each new webpage structure and is therefore simpler and less error-prone.
The comparison result may be one of three types: first, nothing has changed; second, the content of the node data has changed; third, the structure of the DOM tree has changed.
If the content of the node data has changed, the new webpage source code is stored in the graph database and the changed data content is marked;
If the structure of the DOM tree has changed, the new webpage source code is stored in the graph database and the changed subtree is marked. Marking the changed subtree may consist of marking the root node of the changed subtree.
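The three result types can be sketched as a small classifier. The text does not spell out the exact test for each type, so the criteria below (a missing record, a different tag name, or a different child count counts as structural; different attributes or text counts as content) are assumptions, as are the record fields and the names `Result` and `classify`.

```python
from enum import Enum


class Result(Enum):
    UNCHANGED = 1
    CONTENT_CHANGED = 2
    STRUCTURE_CHANGED = 3


def classify(x, y):
    """Classifies tag node x (new page) against stored node data y (old page).

    y is None when no corresponding node data could be found.
    """
    if y is None or x["name"] != y["name"] or x["child_count"] != y["child_count"]:
        return Result.STRUCTURE_CHANGED
    if x["attrs"] != y["attrs"] or x["text"] != y["text"]:
        return Result.CONTENT_CHANGED
    return Result.UNCHANGED


old_p = {"name": "p", "attrs": {}, "text": "Hello", "child_count": 0}
new_p = {"name": "p", "attrs": {}, "text": "Hi", "child_count": 0}

print(classify(old_p, old_p))
print(classify(new_p, old_p))
print(classify(new_p, None))
```

When the result is STRUCTURE_CHANGED, the node being compared is the natural candidate for the root of the changed subtree to mark.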
Because the data are stored in the graph database, the whole of the second webpage source code can be obtained quickly when its data need to be retrieved from the database.
In one embodiment of the present invention, the web data monitoring method comprises:
S111: receiving the first webpage source code;
S112: parsing the first webpage source code to obtain its DOM tree;
S113: obtaining node data y of the DOM tree of the second webpage source code from the graph database;
S114: obtaining, according to the relationships between node data stored in the graph database, the data of the corresponding tag node x in the DOM tree of the first webpage source code;
S115: comparing the data of tag node x with the corresponding node data y to obtain a comparison result, and continuing until all node data of the DOM tree of the second webpage source code have been traversed.
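Steps S113 to S115 run in the opposite direction: the stored records drive the loop, and each one is located in the new page through its relationship to the root. The sketch below assumes each stored record carries a path of 1-based child positions from the root; that encoding, the tuple layout, and the names `find` and `compare_from_store` are illustrative assumptions.

```python
def find(node, path):
    """Follows 1-based child positions from the root; returns None if the node is missing."""
    for n in path:
        children = node[3]
        if n > len(children):
            return None
        node = children[n - 1]
    return node


def compare_from_store(stored, first_tree):
    """For each stored record y, locates tag node x by path and compares the two."""
    results = {}
    for path, name, attrs, text in stored:
        x = find(first_tree, path)
        if x is None:
            results[path] = "missing"
        elif (x[0], x[1], x[2]) == (name, attrs, text):
            results[path] = "same"
        else:
            results[path] = "changed"
    return results


# Stand-in for records read from the graph database (second, historical page).
stored = [
    ((), "html", {"lang": "en"}, ""),
    ((1,), "head", {}, ""),
    ((2,), "body", {}, ""),
    ((2, 1), "p", {}, "Hello"),
    ((2, 2), "ul", {}, ""),
]
# Hypothetical first (new) page as (name, attrs, text, children) tuples.
first_tree = ("html", {"lang": "en"}, "", [
    ("head", {}, "", []),
    ("body", {}, "", [("p", {}, "Hi", [])]),
])

print(compare_from_store(stored, first_tree))
```

Driving the loop from the stored side makes removed nodes easy to see, since a stored record with no counterpart in the new page is reported as missing.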
It can be understood that, owing to the characteristics of the graph database, obtaining node data y of the DOM tree of the second webpage source code from the graph database also yields the relationships between node data y and other node data, such as its relationship to the root node of the DOM tree. Using that relationship to the root node, the tag node that has the same relationship to the root node of its own DOM tree is located in the first webpage source code.
In one embodiment of the present invention, the way the second webpage source code is stored in the graph database corresponds to the way the first webpage source code is traversed. The DOM tree corresponding to the second webpage source code is traversed using method A, the corresponding NodeList is obtained in traversal order, and the NodeList is stored in the graph database. When the webpage is checked, the first webpage is traversed using the same method A, and the NodeList corresponding to the second webpage source code is read from the graph database item by item during the traversal.
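The reason the same "method A" must be used on both sides can be seen by comparing two common traversal orders on one tree. The sketch below uses a hypothetical `(name, children)` tuple tree; `depth_first` and `breadth_first` are illustrative names.

```python
from collections import deque


def depth_first(node):
    """Visits a node, then each of its subtrees in turn."""
    name, children = node
    yield name
    for child in children:
        yield from depth_first(child)


def breadth_first(root):
    """Visits nodes level by level."""
    queue = deque([root])
    while queue:
        name, children = queue.popleft()
        yield name
        queue.extend(children)


tree = ("html", [("head", [("title", [])]), ("body", [("p", [])])])

print(list(depth_first(tree)))
print(list(breadth_first(tree)))
```

The two orders differ from the third node onward, so a NodeList stored in one order cannot be read position by position during a traversal that uses the other.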
In one embodiment of the present invention, the graph database is Neo4j. The most basic concepts in Neo4j are the node and the relationship. A node represents an entity and is represented by the org.neo4j.graphdb.Node interface. Two nodes can be connected by different relationships, and a relationship is represented by the org.neo4j.graphdb.Relationship interface. Each relationship consists of three elements: a start node, an end node, and a type. The presence of a start node and an end node means that a relationship has a direction, like an edge in a directed graph. A NodeList object is used to obtain the nodes of the DOM tree in traversal order. When a node is obtained, its node name, node attributes, node content, and parent-child relationships can be obtained from the DOM tree; these are combined into one node data item, which is stored as a Node in the NodeList.
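The text refers to Neo4j's Java interfaces; as an alternative illustration that needs no database connection, the sketch below only generates parameterised Cypher statements that could create such nodes and directed relationships. The label `DomNode`, the relationship type `PARENT_CHILD`, the property names, and the function `cypher_statements` are all assumptions chosen for the example, and nothing here is executed against Neo4j.

```python
def cypher_statements(node, node_id="0"):
    """Yields (statement, parameters) pairs for a (name, text, children) tree.

    A parent and its child are both created before the statement that links them.
    """
    name, text, children = node
    yield (
        "CREATE (:DomNode {id: $id, name: $name, content: $content})",
        {"id": node_id, "name": name, "content": text},
    )
    for n, child in enumerate(children, start=1):
        child_id = f"{node_id}.{n}"
        yield from cypher_statements(child, child_id)
        yield (
            "MATCH (p:DomNode {id: $parent}), (c:DomNode {id: $child}) "
            "CREATE (p)-[:PARENT_CHILD {n: $n}]->(c)",
            {"parent": node_id, "child": child_id, "n": n},
        )


statements = list(cypher_statements(("html", "", [("body", "Hi", [])])))
for statement, params in statements:
    print(statement, params)
```

The arrow in the second statement gives the relationship its direction from parent to child, and the property n records that the child is the N-th child of its parent.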
The node name is the name of a node in the DOM tree, and the node attributes are the attributes of that node. The node content is the value of the node, that is, the content between the start tag and the end tag.
In one embodiment herein, the second webpage source code can be stored as follows:
Receiving the second webpage source code;
Traversing the DOM tree of the second webpage source code to obtain, from the root node to the leaf nodes, the node name, node attributes, node content, and parent-child relationships of each node;
Storing the node data in order, according to the order in which nodes are accessed while traversing the DOM tree of the second webpage source code; the node data include the node name, node attributes, node content, and parent-child relationships of a node.
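The storage steps above can be sketched as flattening the tree into an ordered list of node data records. The record fields, the use of list indices as parent references, and the name `to_node_list` are assumptions for illustration; an actual implementation would write the records to the graph database rather than to a Python list.

```python
def to_node_list(tree):
    """Flattens a (name, attrs, text, children) tree into records in depth-first order."""
    records = []

    def visit(node, parent_index, position):
        name, attrs, text, children = node
        index = len(records)
        records.append({
            "name": name,            # node name
            "attrs": attrs,          # node attributes
            "content": text,         # node content
            "parent": parent_index,  # parent-child relationship: index of the parent
            "position": position,    # N in 'parent-child-N' (0 for the root)
        })
        for n, child in enumerate(children, start=1):
            visit(child, index, n)

    visit(tree, None, 0)
    return records


second_page = ("html", {"lang": "en"}, "", [
    ("head", {}, "", []),
    ("body", {}, "", [("p", {}, "Hello", [])]),
])

node_list = to_node_list(second_page)
for record in node_list:
    print(record)
```

Because the records are appended in access order, reading them back sequentially reproduces the same traversal.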
The present invention also provides a computer device, comprising: a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of any of the methods described above when executing the program.
The present invention also provides a computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the steps of any of the methods described above.
The present invention also provides a device, comprising a comparison module, an object module, and a storage module;
The comparison module is configured to compare each node data item of the DOM tree corresponding to the first webpage source code with that of the DOM tree corresponding to the second webpage source code;
The object module is configured to obtain the webpage monitoring result;
The storage module is configured to store the DOM tree corresponding to the second webpage source code in the graph database according to the structure of that DOM tree.
"At least one", "one or more", and "and/or" as used herein are open-ended expressions that are both conjunctive and disjunctive in operation. For example, "at least one of A, B and C", "at least one of A, B or C", "one or more of A, B and C", and "one or more of A, B or C" each mean A alone, B alone, C alone, A and B together, A and C together, B and C together, or A, B and C together.
The term "a" or "an" entity refers to one or more of that entity. The terms "a" (or "an"), "one or more", and "at least one" are therefore used interchangeably herein. It should also be noted that the terms "comprising", "including", and "having" can be used interchangeably.
The term "automatic" and its variations, as used herein, refer to any process or operation completed without material human input when the process or operation is performed. However, a process or operation can be automatic even if its performance uses material or immaterial human input received before the process or operation is performed. Human input is deemed material if it influences how the process or operation will be performed. Human input that does not influence the performance of the process or operation is not deemed material.
The term "computer-readable medium" as used herein refers to any tangible storage device and/or transmission medium that participates in providing instructions to a processor for execution. A computer-readable medium may be a serial instruction set encoded in a network transport (such as SOAP) over an IP network. Such a medium can take many forms, including but not limited to non-volatile media, volatile media, and transmission media. Non-volatile media include, for example, NVRAM and magnetic or optical disks. Volatile media include dynamic memory such as main memory (e.g., RAM). Common forms of computer-readable media include, for example, a floppy disk, flexible disk, hard disk, magnetic tape or any other magnetic medium, magneto-optical medium, CD-ROM or any other optical medium, punched cards, paper tape or any other physical medium with patterns of holes, RAM, PROM, EPROM, FLASH-EPROM, a solid-state medium such as a memory card, any other memory chip or cartridge, a carrier wave as described below, or any other medium that a computer can read. A digital file attachment to an e-mail, or another self-contained information archive or set of archives, is considered a distribution medium equivalent to a tangible storage medium. When the computer-readable medium is configured as a database, it should be understood that the database may be of any type, such as relational, hierarchical, or object-oriented. Accordingly, the invention is considered to include a tangible storage medium or distribution medium, equivalents recognized in the prior art, and media developed in the future, in which the software implementations of the present invention are stored.
The terms "determine", "compute", and "calculate", and variations thereof, as used herein, are used interchangeably and include any type of methodology, process, mathematical operation, or technique. More specifically, such terms may include interpreted rules or rule languages such as BPEL, in which the logic is not hard-coded but is represented in a rule file that can be read, interpreted, compiled, and executed.
The term "module" or "tool" as used herein refers to any known or later-developed hardware, software, firmware, artificial intelligence, fuzzy logic, or combination of hardware and software capable of performing the functionality associated with that element. In addition, although the invention is described in terms of exemplary embodiments, it should be understood that individual aspects of the invention can be claimed separately.
It should be noted that, in this document, relational terms such as "first" and "second" are used merely to distinguish one entity or operation from another and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise", and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or terminal device comprising a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or terminal device. Without further limitation, an element qualified by the phrase "including ..." or "comprising ..." does not exclude the presence of additional elements in the process, method, article, or terminal device that includes the element. In addition, herein, "greater than", "less than", "exceeding", and the like are understood as excluding the stated number, whereas "above", "below", "within", and the like are understood as including it.
Although the above embodiments have been described, a person skilled in the art who has grasped the basic inventive concept can make additional changes and modifications to them. The above is therefore only an embodiment of the present invention and does not limit the scope of patent protection of the invention; any equivalent structure or equivalent process transformation made using the contents of the description and drawings of the present invention, whether applied directly or indirectly in other related technical fields, likewise falls within the scope of patent protection of the invention.

Claims (10)

1. A web data monitoring method, characterized by comprising:
Comparing each node data item of the DOM tree corresponding to a first webpage source code with that of the DOM tree corresponding to a second webpage source code to obtain a webpage monitoring result;
Wherein each node data item of the DOM tree corresponding to the second webpage source code is stored in a graph database according to the structure of that DOM tree.
2. The method according to claim 1, characterized in that comparing each node data item of the DOM tree corresponding to the first webpage source code with that of the DOM tree corresponding to the second webpage source code comprises:
Parsing the first webpage source code to obtain the DOM tree of the first webpage source code;
While traversing the DOM tree of the first webpage source code, obtaining from the graph database, according to the access order of tag node x, the corresponding node data y of the DOM tree of the second webpage source code;
Comparing the data of tag node x with the corresponding node data y;
Continuing until all tag nodes of the DOM tree of the first webpage source code have been traversed.
3. The method according to claim 1, characterized in that comparing each node data item of the DOM tree corresponding to the first webpage source code with that of the DOM tree corresponding to the second webpage source code comprises:
Parsing the first webpage source code to obtain the DOM tree of the first webpage source code;
Obtaining node data y of the DOM tree of the second webpage source code from the graph database;
Obtaining, according to the relationships between node data stored in the graph database, the data of the corresponding tag node x in the DOM tree of the first webpage source code;
Comparing the data of tag node x with the corresponding node data y;
Continuing until all node data of the DOM tree of the second webpage source code have been traversed.
4. The method according to claim 1, characterized in that the way each node data item of the DOM tree corresponding to the second webpage source code is stored in the graph database corresponds to the way the first webpage source code is traversed.
5. The method according to claim 1, characterized in that the graph database is Neo4j.
6. The method according to claim 1, characterized in that, before receiving the first webpage source code, the method further comprises:
Traversing the DOM tree of the second webpage source code to obtain the node data from the root node to the leaf nodes of the DOM tree;
Storing the node data in order, according to the order in which nodes are accessed while traversing the DOM tree of the second webpage source code.
7. The method according to claim 1, characterized in that the node data include the node name, node attributes, node content, and parent-child relationships of a node.
8. A computer device, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the steps of the method according to any one of claims 1-7.
9. A computer-readable storage medium having a computer program stored thereon, wherein the program, when executed by a processor, implements the steps of the method according to any one of claims 1-7.
10. A device, comprising a comparison module, an object module, and a memory module;
The comparison module is configured to compare each node data of the DOM tree corresponding to the first webpage source code with that of the DOM tree corresponding to the second webpage source code;
The object module is configured to obtain a webpage monitoring result;
The memory module is configured to store the DOM tree corresponding to the second webpage source code in the graph database according to the structure of that DOM tree.
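The division into three modules might be sketched as below. This is an illustration only: an in-memory list replaces the graph database, DOM nodes are hypothetical `(tag, text, children)` tuples, and the class and method names are assumptions chosen to mirror the module names in the claim.

```python
from typing import Iterator, List, Optional, Tuple

Node = Tuple[str, str, list]  # assumed shape: (tag, text, children)
Data = Tuple[str, str]


def walk(node: Node) -> Iterator[Data]:
    """Yield (tag, text) for each node, parent before children."""
    tag, text, children = node
    yield tag, text
    for child in children:
        yield from walk(child)


class MemoryModule:
    """Stores the second page's node data following the tree structure."""

    def __init__(self) -> None:
        self._nodes: List[Data] = []  # stand-in for the graph database

    def store(self, second_tree: Node) -> None:
        self._nodes = list(walk(second_tree))

    def fetch(self, index: int) -> Optional[Data]:
        return self._nodes[index] if index < len(self._nodes) else None

    def size(self) -> int:
        return len(self._nodes)


class ComparisonModule:
    """Compares node data of the first page with the stored second page."""

    def __init__(self, memory: MemoryModule) -> None:
        self._memory = memory

    def compare(self, first_tree: Node) -> List[int]:
        differing = []
        visited = 0
        for index, x in enumerate(walk(first_tree)):
            visited = index + 1
            if x != self._memory.fetch(index):
                differing.append(index)
        # Stored nodes that were never reached no longer exist in the first page.
        differing.extend(range(visited, self._memory.size()))
        return differing


class ObjectModule:
    """Turns the comparison output into a monitoring result."""

    def result(self, differing: List[int]) -> dict:
        return {"changed": bool(differing), "differing_nodes": differing}


memory = MemoryModule()
memory.store(("html", "", [("p", "a", []), ("p", "b", [])]))
comparison = ComparisonModule(memory)
outcome = ObjectModule().result(comparison.compare(("html", "", [("p", "a", [])])))
```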
CN201710552265.6A 2017-07-07 2017-07-07 Web data monitoring method and equipment Pending CN109255088A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710552265.6A CN109255088A (en) 2017-07-07 2017-07-07 Web data monitoring method and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710552265.6A CN109255088A (en) 2017-07-07 2017-07-07 Web data monitoring method and equipment

Publications (1)

Publication Number Publication Date
CN109255088A true CN109255088A (en) 2019-01-22

Family

ID=65050920

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710552265.6A Pending CN109255088A (en) 2017-07-07 2017-07-07 Web data monitoring method and equipment

Country Status (1)

Country Link
CN (1) CN109255088A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111859231A (en) * 2019-04-30 2020-10-30 中移(苏州)软件技术有限公司 Webpage monitoring method, equipment, device and computer storage medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050039117A1 (en) * 2003-08-15 2005-02-17 Fuhwei Lwo Method, system, and computer program product for comparing two computer files
CN101184105A (en) * 2006-11-18 2008-05-21 国际商业机器公司 Client appartus for updating data
CN102129428A (en) * 2010-01-20 2011-07-20 腾讯科技(深圳)有限公司 Method and device for subscribing information from webpage
CN102193990A (en) * 2011-03-25 2011-09-21 北京世纪互联工程技术服务有限公司 Pattern database and realization method thereof
CN102682098A (en) * 2012-04-27 2012-09-19 北京神州绿盟信息安全科技股份有限公司 Method and device for detecting web page content changes
CN103577526A (en) * 2013-08-01 2014-02-12 星云融创(北京)信息技术有限公司 Method and system as well as browser for verifying page modification
CN104391964A (en) * 2014-12-01 2015-03-04 南京大学 Method for storing source codes into graph database
CN105630902A (en) * 2015-12-21 2016-06-01 明博教育科技股份有限公司 Method for rendering and incrementally updating webpages

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
胡昌龙: "《虚拟社会网络下群行为感知与规律研究》", 30 November 2016, 武汉大学出版社 *

Similar Documents

Publication Publication Date Title
Berlingerio et al. Abacus: frequent pattern mining-based community discovery in multidimensional networks
Goasdoué et al. RDF graph summarization for first-sight structure discovery
Tekli et al. A novel XML document structure comparison framework based-on sub-tree commonalities and label semantics
Dodds et al. Linked data patterns
JP2005092889A (en) Information block extraction apparatus and method for web page
CN103177094A (en) Cleaning method of data of internet of things
Grandi Dynamic class hierarchy management for multi-version ontology-based personalization
Di Iorio et al. Dealing with structural patterns of XML documents
Kiu et al. TaxoFolk: a hybrid taxonomy–folksonomy classification for enhanced knowledge navigation
Cole et al. Suffix trays and suffix trists: Structures for faster text indexing
US9524351B2 (en) Requesting, responding and parsing
CN109255088A (en) Web data monitoring method and equipment
Asghari et al. XML document clustering: techniques and challenges
KR101380605B1 (en) A Hypergraph-based Storage Method for Managing RDF Version
Lim et al. Generalized and lightweight algorithms for automated web forum content extraction
Murolo et al. Revisiting web data extraction using in-browser structural analysis and visual cues in modern web designs
Manica et al. Orion: A cypher-based web data extractor
Gayoso-Cabada et al. Learning object repositories with dynamically reconfigurable metadata schemata
Zheng et al. Research on the Application of XML in Fault Diagnosis IETM
Faheem Intelligent content acquisition in Web archiving
Mattam et al. A Framework for Knowledgebase Curation using Cognitive Web Architecture
Lohmann Conceptualization and visualization of tagging and folksonomies
Keegan et al. Analyzing multi-dimensional networks within mediawikis
Jayanthi et al. Referenced attribute Functional Dependency Database for visualizing web relational tables
Ciaccia et al. The collection index to support complex approximate queries

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190122