CN107203748A - Method and apparatus for content-based storage, matching, and restoration of web page notes
- Publication number
- CN107203748A (application number CN201710350594.2A)
- Authority
- CN
- China
- Prior art keywords
- web page
- strokes
- group
- notes
- webpage
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/32—Digital ink
- G06V30/36—Matching; Classification
- G06V30/387—Matching; Classification using human interaction, e.g. selection of the best displayed recognition candidate
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/957—Browsing optimisation, e.g. caching or content distillation
Abstract
The invention discloses a method and apparatus for content-based storage, matching, and restoration of web page notes. The method proceeds as follows. First, the strokes that a user inputs on a displayed web page are obtained and combined into stroke groups, and the web page elements corresponding to each stroke group are calculated. The note information is then stored under the web page address. When the web page is displayed again, the corresponding note information is retrieved by web page address, and the web page elements corresponding to each stroke group in the retrieved note information are matched against the web page elements in the current web page. Finally, according to the matching result, the corresponding stroke groups are extracted from the retrieved note information and restored. With this method, when the web page content changes, the notes can still be reproduced as long as the content that the notes refer to has not changed, so changes to other parts of the page are ignored.
Description
Technical field
The present invention relates to webpage notes.
Background art
With the spread of mobile terminals such as tablet computers, touch-screen notebooks, and smartphones, taking notes on a computer has become increasingly convenient. If a user could take notes directly on a web page while browsing, with the notes stored on the network and reproduced the next time the user visits the page, this would be a great convenience. However, most websites today use dynamic pages whose structure and content change frequently, so the notes can no longer be kept in correspondence with the page content. This is especially true of pages that carry advertisements, where the advertising content changes on every visit. A change in the advertising content does not affect the body content of the page, and it should not affect the notes either. It is therefore necessary to check the notes against the web page for consistency, so that web page notes are stored, matched, and restored on the basis of content.
Summary of the invention
The problem to be solved by the present invention is how to match notes to a web page when the content of that web page changes.
To solve this problem, the present invention adopts the following scheme.
A method for content-based storage, matching, and restoration of web page notes according to the present invention comprises the following steps:
S1: Obtain the strokes that the user inputs on the displayed web page, and combine the strokes into stroke groups;
S2: Calculate the web page elements corresponding to each stroke group;
S3: Store the note information under the web page address; the note information includes a set of stroke snapshots, and each stroke snapshot includes a stroke group and the web page elements corresponding to that stroke group;
S4: When a web page is displayed, retrieve the corresponding note information by web page address;
S5: Match the web page elements corresponding to each stroke group in the retrieved note information against the web page elements in the current web page;
S6: According to the matching result, extract the corresponding stroke groups from the retrieved note information and restore them.
Further, in the method according to the present invention, the matching result in step S6 is the overall matching degree between the web page elements corresponding to each stroke group in the note information and the web page elements in the current web page; the note information also includes a snapshot of the original web page; and step S6 proceeds as follows:
When the overall matching degree is below the low threshold, the user is told that the web page has changed too much for the notes to be restored;
When the overall matching degree is above the high threshold, the stroke groups are restored on the current web page;
When the overall matching degree lies between the low threshold and the high threshold, the original web page snapshot and the stroke groups are shown in a separate window, and the stroke groups are simultaneously restored on the current web page.
Further, in the method according to the present invention, the matching result in step S6 is the overall matching degree between the web page elements corresponding to each stroke group in the note information and the web page elements in the current web page; and step S5 comprises:
S511: Extract the web page elements of every stroke group in the retrieved notes and merge them into web page element set F1;
S512: Map the coordinates of each stroke group in the retrieved notes onto the current web page and, using the method of step S2, determine the web page elements that each stroke group corresponds to in the current web page; these form web page element set F2;
S513: Compute the intersection of web page element sets F1 and F2 to obtain web page element set X;
S514: Compute the ratio of the number of elements in X to the number of elements in F1 as the overall matching degree.
Further, in the method according to the present invention, the note information also includes the aspect ratio (height to width) of the original web page; the matching result in step S6 is the overall matching degree between the web page elements corresponding to each stroke group in the note information and the web page elements in the current web page; and step S5 also includes comparing the aspect ratio of the current web page with the original aspect ratio stored in the note information and judging whether the two differ too much. If they do, the overall matching degree is set to 0.
Further, in the method according to the present invention, when the stroke groups are restored on the current web page, the coordinates of each stroke group are mapped onto the current web page, and the web page elements that each stroke group corresponds to in the current web page are calculated by the method of step S2; the web page elements that each stroke group corresponded to in the original web page are extracted; for each stroke group, it is judged whether its web page elements in the current web page match those in the original web page; matching stroke groups are displayed normally, and the others are displayed in a prompt mode.
An apparatus for content-based storage, matching, and restoration of web page notes according to the present invention comprises the following modules:
M1, configured to: obtain the strokes that the user inputs on the displayed web page, and combine the strokes into stroke groups;
M2, configured to: calculate the web page elements corresponding to each stroke group;
M3, configured to: store the note information under the web page address; the note information includes a set of stroke snapshots, and each stroke snapshot includes a stroke group and the web page elements corresponding to that stroke group;
M4, configured to: when a web page is displayed, retrieve the corresponding note information by web page address;
M5, configured to: match the web page elements corresponding to each stroke group in the retrieved note information against the web page elements in the current web page;
M6, configured to: according to the matching result, extract the corresponding stroke groups from the retrieved note information and restore them.
Further, in the apparatus according to the present invention, the matching result in module M6 is the overall matching degree between the web page elements corresponding to each stroke group in the note information and the web page elements in the current web page; the note information also includes a snapshot of the original web page; and module M6 proceeds as follows:
When the overall matching degree is below the low threshold, the user is told that the web page has changed too much for the notes to be restored;
When the overall matching degree is above the high threshold, the stroke groups are restored on the current web page;
When the overall matching degree lies between the low threshold and the high threshold, the original web page snapshot and the stroke groups are shown in a separate window, and the stroke groups are simultaneously restored on the current web page.
Further, in the apparatus according to the present invention, the matching result in module M6 is the overall matching degree between the web page elements corresponding to each stroke group in the note information and the web page elements in the current web page; and module M5 comprises:
M511, configured to: extract the web page elements of every stroke group in the retrieved notes and merge them into web page element set F1;
M512, configured to: map the coordinates of each stroke group in the retrieved notes onto the current web page and, through module M2, determine the web page elements that each stroke group corresponds to in the current web page; these form web page element set F2;
M513, configured to: compute the intersection of web page element sets F1 and F2 to obtain web page element set X;
M514, configured to: compute the ratio of the number of elements in X to the number of elements in F1 as the overall matching degree.
Further, in the apparatus according to the present invention, the note information also includes the aspect ratio (height to width) of the original web page; the matching result in module M6 is the overall matching degree between the web page elements corresponding to each stroke group in the note information and the web page elements in the current web page; and module M5 also compares the aspect ratio of the current web page with the original aspect ratio stored in the note information and judges whether the two differ too much. If they do, the overall matching degree is set to 0.
Further, in the apparatus according to the present invention, when the stroke groups are restored on the current web page, the coordinates of each stroke group are mapped onto the current web page, and the web page elements that each stroke group corresponds to in the current web page are calculated by module M2; the web page elements that each stroke group corresponded to in the original web page are extracted; for each stroke group, it is judged whether its web page elements in the current web page match those in the original web page; matching stroke groups are displayed normally, and the others are displayed in a prompt mode.
The technical effect of the present invention is as follows. When a note is taken, the present invention stores the web page elements that the note refers to; when the note is restored, the stored web page elements are compared with the current web page content, and the notes are reproduced according to the matching result. With this method, when the web page content changes, the notes can be reproduced as long as the content that the notes refer to has not changed, so changes to other parts of the page are ignored.
Detailed description of the embodiments
The present invention is described in further detail below.
The present embodiment involves a client, a cloud storage server, and a web server. The client may be a desktop personal computer, a notebook, a tablet computer, or a mobile terminal such as a smartphone. A web browser is installed on the client. The present embodiment is a note plug-in implemented on the web browser. When the user connects to the web server through the browser and a web page is displayed, the user can take notes on the page through the plug-in. The plug-in connects to the cloud storage server and saves the web page notes recorded in the client's browser to it.
The note plug-in includes a note editing module, a web page element correspondence module, a note storage module, a note retrieval module, a note matching module, and a note restoration module. The note editing module provides the user with a UI for editing web page notes, displays the strokes and stroke groups that the user inputs on the current web page, and provides functions for adding, deleting, and modifying stroke groups. The web page element correspondence module determines the web page elements corresponding to a stroke group. The note storage module saves the notes, each made up of the stroke groups that the user has input and the web page elements corresponding to them, to the cloud storage server. The note retrieval module looks up the corresponding notes on the cloud storage server by the address of the current web page. The note matching module matches the web page elements corresponding to each stroke group in the retrieved note information against the web page elements in the current web page. The note restoration module extracts the corresponding stroke groups from the retrieved note information according to the matching result and restores them.
The note editing module corresponds to step S1 and module M1 above; that is, the "strokes" and "stroke groups" that are "obtained" in step S1 and module M1 are formed by the user operating a stylus or mouse in the UI. This technique is familiar to those skilled in the art and is not described further here. It should be pointed out that a "stroke group" is a logical concept determined by the user. For example, a pair of brackets consists of a left bracket and a right bracket, each of which is a stroke. Saving the left bracket or the right bracket on its own is meaningless; only when the two strokes are combined into a pair of brackets do they carry a definite logical meaning, and this pair of brackets made of two strokes is a "stroke group".
The web page element correspondence module, note storage module, note retrieval module, note matching module, and note restoration module are described in further detail below.
1. Web page element correspondence module
The web page element correspondence module corresponds to step S2 and module M2 above. A web page element is an object marked up by an HTML tag, usually of text type, as is well known in the art. The web page element corresponding to a stroke group may be a leaf-node element or a non-leaf-node element.
The simplest way to determine the correspondence between stroke groups and web page elements is for the user to specify it. That is, the user must specify, for each stroke group, the web page elements it corresponds to. After the user creates a stroke group, the note editing module requires the user to specify at least one web page element as the corresponding element; if the user specifies none, creation of the stroke group fails.
The correspondence can also be determined semi-automatically. In this embodiment, when editing a stroke group in the note editing module, the user must specify the type of the stroke group. A localization region is then determined from the type of the stroke group and its coordinates, and the web page elements covered by the localization region are taken as the elements corresponding to the stroke group. Determining the localization region, and determining the corresponding web page elements from it, are carried out automatically by the computer program, whereas the type of the stroke group requires user input, so this is a semi-automatic approach.
In the present embodiment, the correspondence between stroke groups and web page elements is determined fully automatically, through the following steps:
S21: Determine the type of the stroke group by analyzing its shape;
S22: Determine the localization region from the type of the stroke group and its coordinates;
S23: Take the web page elements covered by the localization region as the web page elements corresponding to the stroke group.
In steps S21, S22, and S23, stroke groups are divided into the following types: closed, underline, strikethrough, bracket, quotation mark, connecting line, and text. For a closed stroke group, the localization region may be the region enclosed by the stroke group, or that region extended outward by a certain distance. For an underline stroke group, the localization region is the region covered when the stroke is extended upward by a certain distance. For a strikethrough stroke group, the localization region is the region covered when the stroke is extended both upward and downward by a certain distance. For a bracket stroke group, the localization region is the region between the brackets, bounded by their top and bottom horizontal lines. For a quotation mark stroke group, the localization region is the region covered when the horizontal line between the quotation marks is extended downward by a certain distance. For a connecting line stroke group, the localization region is a circular region centered on the end point of the connecting line, with a certain distance as its radius. For a text stroke group, the localization region is the text region extended outward by a certain distance.
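Steps S22 and S23 can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Rect` class, the type names, the `margin` parameter standing in for the unspecified "certain distance", and the use of bounding boxes in place of exact stroke outlines are all assumptions. In particular, the circular region of a connecting line is approximated by its bounding square, and the quotation mark rule follows one possible reading of the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle; y grows downward, as in browser coordinates."""
    left: float
    top: float
    right: float
    bottom: float

    def expand(self, left=0.0, top=0.0, right=0.0, bottom=0.0):
        return Rect(self.left - left, self.top - top,
                    self.right + right, self.bottom + bottom)

    def intersects(self, other):
        return (self.left < other.right and other.left < self.right
                and self.top < other.bottom and other.top < self.bottom)


def localization_region(kind, box, margin=10.0):
    """S22: derive the localization region from the group type and its box."""
    if kind in ("closed", "text", "connector"):
        # For "connector", box is assumed to be the degenerate rectangle at
        # the line's end point, so the result is the square around the circle.
        return box.expand(margin, margin, margin, margin)
    if kind == "underline":
        return box.expand(top=margin)                 # extend upward only
    if kind == "strikethrough":
        return box.expand(top=margin, bottom=margin)  # extend up and down
    if kind == "bracket":
        return box                                    # area between the brackets
    if kind == "quote":
        # Assumed reading: a band starting at the top line between the marks.
        return Rect(box.left, box.top, box.right, box.top + margin)
    raise ValueError(f"unknown stroke group type: {kind}")


def covered_elements(region, element_boxes):
    """S23: ids of the elements whose boxes overlap the localization region."""
    return {eid for eid, rect in element_boxes.items() if region.intersects(rect)}
```

For an underline drawn just beneath a line of text, the region reaches upward into that line, so the text element above the stroke is selected and the element below it is not.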
Determining the type of a stroke group from its shape in step S21 comprises the following steps:
S211: Analyze whether the stroke group forms a closed shape; if so, the type is closed and the procedure returns;
S212: Compute the difference between the maximum and minimum Y coordinates and the difference between the maximum and minimum X coordinates of the stroke, and check them against limit values to judge whether the stroke group is an underline or a strikethrough; if the limit test is passed, analyze whether the stroke group lies below a web page element to decide between the underline type and the strikethrough type;
S213: Analyze whether the stroke group contains a left-bracket stroke and a right-bracket stroke to judge whether it is of bracket type;
S214: Analyze whether the stroke group contains two double quotation marks to judge whether it is of quotation mark type;
S215: Analyze whether the stroke group contains a line with an arrow to judge whether it is of connecting line type;
S216: If the stroke group meets none of the above types, it is deemed to be of text type.
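The cascade S211 to S216 might be sketched as follows. The text does not give the shape tests themselves, so everything concrete here is an assumption: the closure heuristic, the limit values `max_dy` and `min_dx`, the `element_bottom` parameter (the bottom edge of the nearest text element, with y growing downward), and the recognizer callbacks `is_bracket`, `is_quote`, and `is_arrow`, which are hypothetical placeholders for real shape recognizers.

```python
import math


def is_closed(stroke, tolerance=0.2):
    """S211 heuristic: the stroke ends close to where it started."""
    if len(stroke) < 3:
        return False
    xs = [x for x, _ in stroke]
    ys = [y for _, y in stroke]
    size = max(max(xs) - min(xs), max(ys) - min(ys))
    gap = math.dist(stroke[0], stroke[-1])
    return size > 0 and gap <= tolerance * size


def is_flat_line(stroke, max_dy=8.0, min_dx=30.0):
    """S212 limit test: small vertical extent, large horizontal extent."""
    xs = [x for x, _ in stroke]
    ys = [y for _, y in stroke]
    return max(ys) - min(ys) <= max_dy and max(xs) - min(xs) >= min_dx


def _never(group):
    """Placeholder recognizer that rejects every stroke group."""
    return False


def classify(group, element_bottom,
             is_bracket=_never, is_quote=_never, is_arrow=_never):
    """Return the stroke group type; group is a list of strokes (point lists)."""
    if any(is_closed(stroke) for stroke in group):
        return "closed"                                   # S211
    if len(group) == 1 and is_flat_line(group[0]):        # S212
        mean_y = sum(y for _, y in group[0]) / len(group[0])
        return "underline" if mean_y >= element_bottom else "strikethrough"
    if is_bracket(group):
        return "bracket"                                  # S213
    if is_quote(group):
        return "quote"                                    # S214
    if is_arrow(group):
        return "connector"                                # S215
    return "text"                                         # S216
```

The order of the checks matters: each test runs only if all earlier ones failed, so the text type acts as the fallback.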
It should be pointed out that the web page elements corresponding to a stroke group form a set; that is, one stroke group may correspond to several web page elements.
2. Note storage module
The note storage module corresponds to step S3 and module M3 above. In the present embodiment, the note information is stored on the cloud storage server. Those skilled in the art will understand that the note information could also be stored locally on the client. It may be stored as files or in a database. The note information is stored under the web page address so that it can be found easily at retrieval time. Specifically, with database storage the web page address serves as the key field; with file storage the web page address can serve as the file name. The note information includes web page meta-information, a web page snapshot, and a set of stroke snapshots. The web page meta-information includes the page title, the access time, and the page aspect ratio. The web page snapshot may be a screenshot of the page or an HTML document. Because handling large CSS files is cumbersome when an HTML document is used, the present embodiment prefers a screenshot as the web page snapshot. A stroke snapshot includes the stroke group, a timestamp, and the web page elements corresponding to the stroke group.
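The record layout described above could look like the following Python sketch. The class and field names are illustrative, not taken from the patent; an in-memory dictionary stands in for the cloud storage server; and identifying a web page element by a string (for instance its text content) is an assumption.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class StrokeSnapshot:
    stroke_group: list   # strokes, each a list of [x, y] points
    timestamp: float
    elements: list       # identifiers of the corresponding web page elements


@dataclass
class PageMeta:
    title: str
    access_time: float
    aspect_ratio: float  # page height divided by page width


@dataclass
class NoteInfo:
    meta: PageMeta
    page_snapshot: str   # e.g. a reference to a stored screenshot
    stroke_snapshots: list = field(default_factory=list)


class NoteStore:
    """Keeps one serialized note record per web page address."""

    def __init__(self):
        self._records = {}

    def save(self, url, note):
        self._records[url] = json.dumps(asdict(note))

    def find(self, url):
        raw = self._records.get(url)
        return None if raw is None else json.loads(raw)
```

Keying the record by URL makes the retrieval of step S4 a single lookup; `find` returns `None` when no notes exist for the address.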
3. Note retrieval module
The note retrieval module corresponds to step S4 and module M4 above. It looks up, by web page address, whether corresponding note information exists. How retrieval works depends on how the note storage module stores the note information; this is well known to those skilled in the art and is not described further here.
4. Note matching module and note restoration module
The note matching module corresponds to step S5 and module M5 above. The note restoration module corresponds to step S6 and module M6 above. Note restoration depends on the result of note matching; the two are closely related and can also be merged into a single step or module, a matching-and-restoration module. Matching and restoration can be implemented in several ways. The simplest is for the note matching module to find the matching stroke groups directly, so that the matching result is the set of matching stroke groups, which the note restoration module then displays. In the present embodiment, the matching result is an overall matching degree: the note matching module calculates the overall matching degree, and the note restoration module extracts the corresponding stroke groups from the retrieved note information and displays them according to it. The overall matching degree is calculated as follows. First, the basic page information is compared; that is, the basic web page information saved in the note information is compared with that of the current web page. Specifically, the aspect ratio in the note information is compared with the aspect ratio of the current web page. If the ratio of the stored aspect ratio to the current aspect ratio is greater than 1.5 or less than 0.7, the current web page is considered to differ too much from the original, and an overall matching degree of 0 is returned; otherwise the calculation continues with the steps below.
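The aspect ratio pre-check can be written in a few lines of Python. The bounds 1.5 and 0.7 come from the text; the function name and the handling of a non-positive ratio are assumptions.

```python
def aspect_ratio_compatible(stored_ratio, current_ratio, upper=1.5, lower=0.7):
    """Return False when the page shape has changed too much to match notes."""
    if current_ratio <= 0:
        return False  # assumed guard; the text does not cover this case
    quotient = stored_ratio / current_ratio
    return lower <= quotient <= upper
```

When the function returns `False`, the overall matching degree is 0 and the remaining steps are skipped.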
The overall matching degree can be calculated in any of the following ways. The first way is as follows:
S511: Extract the web page elements of every stroke group in the retrieved notes and merge them into web page element set F1;
S512: Map the coordinates of each stroke group in the retrieved notes onto the current web page and, using the method of step S2, determine the web page elements that each stroke group corresponds to in the current web page; these form web page element set F2;
S513: Compute the intersection of web page element sets F1 and F2 to obtain web page element set X;
S514: Compute the ratio of the number of elements in X to the number of elements in F1 as the overall matching degree.
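Steps S511 to S514 reduce to a set computation, sketched here in Python. The sketch assumes that each web page element is represented by a hashable identifier and that step S512 has already been carried out, so that `current_groups` holds the elements found for each stroke group on the current page; returning 0 for an empty F1 is also an assumption.

```python
def overall_matching_degree(stored_groups, current_groups):
    """|F1 ∩ F2| / |F1| over the element sets of all stroke groups."""
    f1 = set().union(*stored_groups)    # S511: elements stored with the notes
    f2 = set().union(*current_groups)   # S512: elements found on the current page
    if not f1:
        return 0.0                      # assumed: nothing stored, nothing to match
    x = f1 & f2                         # S513: intersection
    return len(x) / len(f1)             # S514: ratio
```

For example, with stored groups `[{"a", "b"}, {"c"}]` and current groups `[{"a"}, {"c", "d"}]`, X is `{"a", "c"}` and the degree is 2/3.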
The second way is as follows:
S521: Count the number Nk of stroke groups in the retrieved notes whose web page elements completely match web page elements in the page content returned for the web page request;
S522: Compute the ratio of Nk to Nm as the overall matching degree, where Nm is the number of stroke groups in the retrieved notes.
The third way is as follows:
S531: Extract the web page elements of every stroke group in the retrieved notes and merge them into web page element set F;
S532: Compute the intersection of the web page elements in the current web page with F to obtain web page element set X;
S533: Compute the ratio of the number of elements in X to the number of elements in F as the overall matching degree.
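The second and third ways can be sketched side by side in Python. Both compare against the full set of elements on the current page rather than against re-mapped stroke positions. Treating "completely match" as "every stored element of the group appears on the current page" is an interpretation, and the function names are illustrative.

```python
def matching_degree_by_group(stored_groups, page_elements):
    """Second way: Nk / Nm, the share of stroke groups that match completely."""
    if not stored_groups:
        return 0.0
    nk = sum(1 for group in stored_groups if group and group <= page_elements)
    return nk / len(stored_groups)


def matching_degree_by_element(stored_groups, page_elements):
    """Third way: |F ∩ page| / |F| over all stored elements."""
    f = set().union(*stored_groups)
    if not f:
        return 0.0
    return len(f & page_elements) / len(f)
```

The two measures can disagree: a stroke group that has lost one of its three elements counts as a complete miss under the second way but still contributes two matched elements under the third.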
The present embodiment prefers the first way described above.
As the ways described above show, the overall matching degree is a value between 0 and 1. There are also many ways to extract and display the corresponding stroke groups from the retrieved note information according to the overall matching degree. The simplest is to fix a single threshold, such as 0.5, and check whether the overall matching degree exceeds it: if it does, the stroke groups are displayed on the current web page; otherwise they are not displayed, or the user is told that the web page has changed too much for the notes to be restored.
The present embodiment uses two thresholds, a high threshold and a low threshold, both fixed in advance. When the overall matching degree is below the low threshold, the user is told that the web page has changed too much for the notes to be restored. When the overall matching degree is above the high threshold, the stroke groups are displayed on the current web page. When the overall matching degree lies between the low threshold and the high threshold, the original web page snapshot and the stroke groups are shown in a separate window, and the stroke groups are also displayed on the current web page. In other words, when the overall matching degree lies between the two thresholds, the stroke groups are shown side by side with the original so that the user can compare them.
When restoring the stroke groups on the current web page as described above, the present embodiment uses the following method. The coordinates of each stroke group are mapped onto the current web page, and the web page elements that each stroke group corresponds to in the current web page are calculated by module M2. The web page elements that each stroke group corresponded to in the original web page are extracted. For each stroke group, it is judged whether its web page elements in the current web page match those in the original web page. Matching stroke groups are displayed normally, and the others are displayed in a prompt mode. For example, stroke groups are drawn in black in normal display and in another color, such as grey, red, or yellow, in prompt mode. The user can thus tell whether the web page elements under a stroke group still correspond to those of the original web page.
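The per-group display decision can be sketched as below. The sketch assumes that the two lists are aligned by stroke group, and that "match" means the two element sets are equal; a looser criterion would fit the text equally well.

```python
def display_modes(stored_groups, current_groups):
    """Return "normal" or "prompt" for each stroke group, in order."""
    return [
        "normal" if stored == current else "prompt"
        for stored, current in zip(stored_groups, current_groups)
    ]
```

A renderer could then draw "normal" groups in black and "prompt" groups in grey, red, or yellow, as the text suggests.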
Claims (10)
1. A method for content-based storage, matching, and restoration of web page notes, characterized in that it comprises the following steps:
S1: Obtain the strokes that the user inputs on the displayed web page, and combine the strokes into stroke groups;
S2: Calculate the web page elements corresponding to each stroke group;
S3: Store the note information under the web page address; the note information includes a set of stroke snapshots, and each stroke snapshot includes a stroke group and the web page elements corresponding to that stroke group;
S4: When a web page is displayed, retrieve the corresponding note information by web page address;
S5: Match the web page elements corresponding to each stroke group in the retrieved note information against the web page elements in the current web page;
S6: According to the matching result, extract the corresponding stroke groups from the retrieved note information and restore them.
2. The method for content-based storage, matching, and restoration of web page notes as claimed in claim 1, characterized in that the matching result in step S6 is the overall matching degree between the web page elements corresponding to each stroke group in the note information and the web page elements in the current web page; the note information also includes a snapshot of the original web page; and step S6 proceeds as follows:
When the overall matching degree is below the low threshold, the user is told that the web page has changed too much for the notes to be restored;
When the overall matching degree is above the high threshold, the stroke groups are restored on the current web page;
When the overall matching degree lies between the low threshold and the high threshold, the original web page snapshot and the stroke groups are shown in a separate window, and the stroke groups are simultaneously restored on the current web page.
3. The method for content-based storage, matching and restoration of web page notes as claimed in claim 1, characterized in that the matching result in step S6 is the total matching degree between the web page elements corresponding to each stroke group in the note information and the web page elements in the current web page; and step S5 includes:
S511: extracting the web page element of each stroke group in the retrieved notes to form a web page element set F1;
S512: placing each stroke group of the retrieved notes onto the current web page after coordinate mapping, and determining, by the method of step S2, the web page element corresponding to each stroke group in the current web page, to form a web page element set F2;
S513: calculating the intersection of the web page element sets F1 and F2 to obtain a web page element set X;
S514: calculating the ratio of the number of elements in X to the number of elements in F1 as the total matching degree.
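Steps S513 and S514 amount to computing |X| / |F1| with X = F1 ∩ F2. A minimal sketch is shown below, assuming each web page element can be reduced to a hashable identifier (here a string); how two elements are judged equal is not specified in this claim, so that representation and the function name `total_matching_degree` are assumptions.

```python
from typing import AbstractSet


def total_matching_degree(f1: AbstractSet[str], f2: AbstractSet[str]) -> float:
    """Return |F1 ∩ F2| / |F1|, the total matching degree of steps S513-S514.

    f1: identifiers of the elements stored with the notes (set F1).
    f2: identifiers of the elements found under the mapped stroke groups
        in the current page (set F2).
    """
    if not f1:
        # No stored elements to compare against; returning 0.0 is an assumption.
        return 0.0
    x = f1 & f2  # step S513: intersection X
    return len(x) / len(f1)  # step S514: ratio of |X| to |F1|
```

For example, if four elements were stored and two of them are found again in the current page, the total matching degree is 0.5.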
4. The method for content-based storage, matching and restoration of web page notes as claimed in claim 1, characterized in that the note information further includes the height-to-width ratio of the original web page; the matching result in step S6 is the total matching degree between the web page elements corresponding to each stroke group in the note information and the web page elements in the current web page; and step S5 further includes comparing the height-to-width ratio of the current web page with the height-to-width ratio of the original web page stored in the note information and judging whether the two differ too much; if they do, the total matching degree is set to 0.
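The height-to-width check in claim 4 acts as a gate on the total matching degree. The sketch below assumes a relative-difference test with a tolerance parameter; the claim only says the difference is "too large", so the measure, the `max_relative_diff` default and the function name are all hypothetical.

```python
def gate_by_aspect_ratio(total_match: float,
                         original_ratio: float,
                         current_ratio: float,
                         max_relative_diff: float = 0.5) -> float:
    """Force the total matching degree to 0 when the page shape changed too much.

    original_ratio / current_ratio: height divided by width of the original
    and current web page. max_relative_diff is an illustrative assumption.
    """
    if original_ratio <= 0 or current_ratio <= 0:
        raise ValueError("height-to-width ratios must be positive")
    relative_diff = abs(current_ratio - original_ratio) / original_ratio
    if relative_diff > max_relative_diff:
        return 0.0  # pages differ too much: treat as no match
    return total_match
```

Applying this gate before the threshold comparison means a drastically reshaped page is handled like any other page with a total matching degree of 0.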
5. The method for content-based storage, matching and restoration of web page notes as claimed in claim 2, characterized in that, when the stroke groups are restored on the current web page, each stroke group is placed onto the current web page after coordinate mapping, and the web page element corresponding to each stroke group in the current web page is calculated by the method of step S2; the web page element corresponding to each stroke group in the original web page is extracted; it is judged whether the web page elements corresponding to each stroke group in the current web page and in the original web page match; matching stroke groups are displayed normally, and the others are displayed in a prompt mode.
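The per-group check in claim 5 can be sketched as below. The sketch assumes the coordinate mapping and the step-S2 lookup have already produced, for each stroke group, the element it lands on in the current page; the dictionaries, the mode labels `"normal"` and `"flagged"`, and the use of plain equality as the match test are assumptions made for illustration.

```python
from typing import Dict, Optional


def display_modes(original: Dict[str, str],
                  current: Dict[str, Optional[str]]) -> Dict[str, str]:
    """Decide how each stroke group is drawn on the current page.

    original: stroke group id -> element identifier stored with the notes.
    current:  stroke group id -> element identifier found under the mapped
              stroke group in the current page, or None if nothing was found.
    """
    modes: Dict[str, str] = {}
    for group_id, original_element in original.items():
        current_element = current.get(group_id)
        if current_element == original_element:
            modes[group_id] = "normal"   # elements match: display normally
        else:
            modes[group_id] = "flagged"  # mismatch: display in a prompt mode
    return modes
```

A renderer could then draw flagged groups with a distinct colour or marker so the user can see which notes may no longer sit on the right content.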
6. An apparatus for content-based storage, matching and restoration of web page notes, characterized by comprising the following modules:
M1, configured to: obtain the strokes that the user inputs on the web page being browsed, and combine the strokes into stroke groups;
M2, configured to: calculate the web page element corresponding to each stroke group;
M3, configured to: store note information by web page address; the note information includes a set of stroke snapshots; each stroke snapshot includes a stroke group and the web page element corresponding to that stroke group;
M4, configured to: when a web page is displayed, look up the corresponding note information by web page address;
M5, configured to: match the web page elements corresponding to each stroke group in the retrieved note information against the web page elements in the current web page;
M6, configured to: extract the corresponding stroke groups from the retrieved note information according to the matching result, and restore them.
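The data handled by modules M3 and M4 can be pictured with the following sketch: note information is a set of stroke snapshots, each pairing a stroke group with its web page element, and the whole record is keyed by web page address. The class names, the in-memory dictionary and the string representation of an element are assumptions; the claims do not prescribe a storage format.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]


@dataclass
class StrokeGroup:
    strokes: List[List[Point]]  # each stroke is a list of (x, y) points


@dataclass
class StrokeSnapshot:
    group: StrokeGroup
    element: str  # hypothetical identifier of the web page element under the group


@dataclass
class NoteInfo:
    snapshots: List[StrokeSnapshot] = field(default_factory=list)


class NoteStore:
    """In-memory stand-in for M3 (store by address) and M4 (look up by address)."""

    def __init__(self) -> None:
        self._by_url: Dict[str, NoteInfo] = {}

    def save(self, url: str, snapshot: StrokeSnapshot) -> None:
        # Create the note record for this address on first use, then append.
        self._by_url.setdefault(url, NoteInfo()).snapshots.append(snapshot)

    def find(self, url: str) -> Optional[NoteInfo]:
        # Returns None when no notes were stored for this address.
        return self._by_url.get(url)
```

Keying on the address keeps lookup simple, while the stored element per stroke group is what later allows content-based matching when the page layout has changed.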
7. The apparatus for content-based storage, matching and restoration of web page notes as claimed in claim 6, characterized in that the matching result in module M6 is the total matching degree between the web page elements corresponding to each stroke group in the note information and the web page elements in the current web page; the note information further includes a snapshot of the original web page; and module M6 proceeds as follows:
when the total matching degree is below the low threshold, the user is prompted that the web page has changed too much for the notes to be restored;
when the total matching degree is above the high threshold, the stroke groups are displayed on the current web page;
when the total matching degree lies between the low threshold and the high threshold, the original web page snapshot and each stroke group are displayed in another window, and the stroke groups are synchronously restored on the current web page.
8. The apparatus for content-based storage, matching and restoration of web page notes as claimed in claim 6, characterized in that the matching result in module M6 is the total matching degree between the web page elements corresponding to each stroke group in the note information and the web page elements in the current web page; and module M5 includes:
M511, configured to: extract the web page element of each stroke group in the retrieved notes to form a web page element set F1;
M512, configured to: place each stroke group of the retrieved notes onto the current web page after coordinate mapping, and determine, through module M2, the web page element corresponding to each stroke group in the current web page, to form a web page element set F2;
M513, configured to: calculate the intersection of the web page element sets F1 and F2 to obtain a web page element set X;
M514, configured to: calculate the ratio of the number of elements in X to the number of elements in F1 as the total matching degree.
9. The apparatus for content-based storage, matching and restoration of web page notes as claimed in claim 6, characterized in that the note information further includes the height-to-width ratio of the original web page; the matching result in module M6 is the total matching degree between the web page elements corresponding to each stroke group in the note information and the web page elements in the current web page; and module M5 is further configured to compare the height-to-width ratio of the current web page with the height-to-width ratio of the original web page stored in the note information and judge whether the two differ too much; if they do, the total matching degree is set to 0.
10. The apparatus for content-based storage, matching and restoration of web page notes as claimed in claim 7, characterized in that, when the stroke groups are displayed on the current web page, each stroke group is placed onto the current web page after coordinate mapping, and the web page element corresponding to each stroke group in the current web page is calculated through module M2; the web page element corresponding to each stroke group in the original web page is extracted; it is judged whether the web page elements corresponding to each stroke group in the current web page and in the original web page match; matching stroke groups are displayed normally, and the others are displayed in a prompt mode.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710350594.2A CN107203748B (en) | 2017-05-18 | 2017-05-18 | Method and device for storing, matching and restoring webpage notes based on content |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710350594.2A CN107203748B (en) | 2017-05-18 | 2017-05-18 | Method and device for storing, matching and restoring webpage notes based on content |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107203748A (en) | 2017-09-26 |
CN107203748B (en) | 2020-12-22 |
Family
ID=59905719
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710350594.2A Active CN107203748B (en) | 2017-05-18 | 2017-05-18 | Method and device for storing, matching and restoring webpage notes based on content |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107203748B (en) |
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1625741A (en) * | 2002-01-31 | 2005-06-08 | 西尔弗布鲁克研究有限公司 | An electronic filing system searchable by a handwritten search query |
CN101441644A (en) * | 2007-11-19 | 2009-05-27 | 英福达科技股份有限公司 | Web page annotation system and method |
CN101551800A (en) * | 2008-03-31 | 2009-10-07 | 富士通株式会社 | Marked information generation device, inquiry unit and sharing system |
CN102609401A (en) * | 2011-12-26 | 2012-07-25 | 北京大学 | Webpage annotation method |
US20140344658A1 (en) * | 2013-05-15 | 2014-11-20 | Microsoft Corporation | Enhanced links in curation and collaboration applications |
CN104615601A (en) * | 2013-11-04 | 2015-05-13 | 英业达科技有限公司 | Webpage based recording system and method thereof |
CN104794174A (en) * | 2015-04-01 | 2015-07-22 | 百度在线网络技术(北京)有限公司 | Webpage marking information display method and device |
Non-Patent Citations (1)
Title |
---|
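Zhu Xiaohui (朱小辉): "Research and Implementation of Cross-Platform Learning Notes Based on an Education Cloud" (基于教育云的学习笔记跨平台的研究与实现), China Master's Theses Full-text Database (《中国优秀硕士学位论文全文数据库》) * |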
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112486532A (en) * | 2020-11-25 | 2021-03-12 | 中移(杭州)信息技术有限公司 | Method and device for managing configuration file, electronic equipment and storage medium |
CN112486532B (en) * | 2020-11-25 | 2024-04-09 | 中移(杭州)信息技术有限公司 | Configuration file management method and device, electronic equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN107203748B (en) | 2020-12-22 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||