CN108804431A - A kind of keyword effect analysis method based on big data - Google Patents
A kind of keyword effect analysis method based on big data
- Publication number
- CN108804431A (application CN201710281439.XA)
- Authority
- CN
- China
- Prior art keywords
- keyword
- path
- browsing
- aggregators
- effect analysis
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
The invention discloses a keyword effect analysis method based on big data, comprising the following steps: A. record each keyword that occurs, together with the results retrieved using that keyword; B. group the different browsing paths that belong to the same keyword, merge the browsing paths whose similarity exceeds a first threshold, and compute the browsing time and browsing hit rate of the different browsing paths; C. group the different browsing paths that belong to the same retrieval result, merge the browsing paths whose similarity exceeds the first threshold, and compute the browsing time and browsing hit rate of the different browsing paths; D. analyze the statistics obtained in steps B and C to obtain the keyword effect analysis result. The invention can remedy the deficiencies of the prior art and increase the speed of data analysis.
Description
Technical field
The present invention relates to the technical field of big data analysis, and in particular to a keyword effect analysis method based on big data.
Background art
Search engines are network tools commonly used by today's many Internet users. The search results that a search engine displays play a crucial role in increasing a website's page views. Because big data technology offers comprehensive information and highly valid results, using it to analyze how effectively search keywords raise the page views of a target website is a common approach. However, existing analysis methods all compute statistics directly over massive amounts of data; the heavy computation results in poor real-time performance of the analysis.
Summary of the invention
The technical problem to be solved by the present invention is to provide a keyword effect analysis method based on big data that can remedy the deficiencies of the prior art and increase the speed of data analysis.
To solve the above technical problem, the present invention adopts the following technical solution.
A keyword effect analysis method based on big data, comprising the following steps:
A. Record each keyword that occurs, together with the results retrieved using that keyword;
B. Group the different browsing paths that belong to the same keyword, merge the browsing paths whose similarity exceeds a first threshold, and compute the browsing time and browsing hit rate of the different browsing paths;
C. Group the different browsing paths that belong to the same retrieval result, merge the browsing paths whose similarity exceeds the first threshold, and compute the browsing time and browsing hit rate of the different browsing paths;
D. Analyze the statistics obtained in steps B and C to obtain the keyword effect analysis result.
Preferably, in step A, taking each keyword in turn as the reference, the remaining keywords are divided into an associated group and a non-associated group according to the similarity of their retrieval results.
Preferably, in step B, the browsing paths belonging to the associated group of the keyword being processed are merged with the browsing paths of the keyword being processed. Several fusion nodes are set on each browsing path, and each fusion node is assigned a session permission for fusing with fusion nodes on other browsing paths. When data passes through a fusion node and satisfies that node's session permission, a temporary data mapping is established for the data at this fusion node and the data is mapped to the corresponding fusion node; the fusion node that receives the data saves the mapping path and the mapping result.
Preferably, in step C, the entropy of each merged browsing path is calculated, and browsing paths whose entropy is higher than a second threshold are deleted.
Preferably, in step C, the hinge nodes in the retained browsing paths are traversed to obtain the characteristic function F(x, y) of each hinge node, where x denotes the links pointing to the hinge node from outside and y denotes the links pointing outward from the hinge node; the characteristic function F(x, y) is labeled with the browsing time and browsing hit rate obtained from the statistics.
Preferably, the browsing time and browsing hit rate obtained in steps B and C are normalized and then combined by weighted summation; the result is directly proportional to the keyword effect.
The advantageous effects of the above technical solution are as follows. By analyzing browsing paths in both directions, the invention speeds up data processing. In the forward analysis, merging browsing paths effectively reduces the amount of repeated computation. In the reverse analysis, feature analysis of the hinge nodes effectively removes invalid data and improves data processing efficiency. The big data processing method of the invention thus effectively avoids the slow data processing that large data volumes otherwise cause.
Detailed description
A specific embodiment of the present invention comprises the following steps:
A. Record each keyword that occurs, together with the results retrieved using that keyword;
B. Group the different browsing paths that belong to the same keyword, merge the browsing paths whose similarity exceeds a first threshold, and compute the browsing time and browsing hit rate of the different browsing paths;
C. Group the different browsing paths that belong to the same retrieval result, merge the browsing paths whose similarity exceeds the first threshold, and compute the browsing time and browsing hit rate of the different browsing paths;
D. Analyze the statistics obtained in steps B and C to obtain the keyword effect analysis result.
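The grouping and merging in steps B and C can be illustrated with a minimal Python sketch. The patent does not specify the similarity measure, the merging strategy, or the record layout, so everything here is an assumption made for illustration: similarity is taken to be the Jaccard similarity of the page sets of two paths, merging is a greedy assignment to the first sufficiently similar path, and the names `jaccard` and `group_and_merge` and the `(key, path, seconds, hit)` record format are hypothetical. The `key` is the keyword for step B or the retrieval result for step C.

```python
from collections import defaultdict


def jaccard(path_a, path_b):
    """Jaccard similarity of the page sets of two browsing paths (assumed measure)."""
    set_a, set_b = set(path_a), set(path_b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 1.0


def group_and_merge(visits, threshold):
    """Group visits by key, merge similar paths, and compute per-path statistics.

    visits: list of (key, path, seconds, hit) tuples, where key is a keyword
    (step B) or a retrieval result (step C). This layout is hypothetical.
    """
    groups = defaultdict(list)
    for key, path, seconds, hit in visits:
        groups[key].append((path, seconds, hit))

    result = {}
    for key, rows in groups.items():
        clusters = []  # one entry per merged browsing path
        for path, seconds, hit in rows:
            for cluster in clusters:
                # Merge into the first path whose similarity exceeds the threshold.
                if jaccard(cluster["path"], path) > threshold:
                    cluster["seconds"].append(seconds)
                    cluster["hits"].append(hit)
                    break
            else:
                clusters.append({"path": path, "seconds": [seconds], "hits": [hit]})
        result[key] = [
            {
                "path": c["path"],
                "visits": len(c["hits"]),
                "mean_seconds": sum(c["seconds"]) / len(c["seconds"]),
                "hit_rate": sum(c["hits"]) / len(c["hits"]),
            }
            for c in clusters
        ]
    return result
```

Because merged paths share one set of statistics, later steps operate on fewer paths than the raw log contains, which is the source of the reduction in repeated computation that the description claims.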
In step A, taking each keyword in turn as the reference, the remaining keywords are divided into an associated group and a non-associated group according to the similarity of their retrieval results.
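One possible reading of this grouping is sketched below in Python. The patent states neither how retrieval-result similarity is measured nor where the boundary between the two groups lies; the sketch assumes Jaccard similarity over result sets and a hypothetical cut-off parameter `threshold`. The function name `split_keywords` is likewise illustrative only.

```python
def split_keywords(results_by_keyword, base, threshold):
    """Split all keywords other than `base` into associated / non-associated groups.

    results_by_keyword: {keyword: iterable of retrieved result identifiers}
    threshold: assumed similarity cut-off (not given in the patent).
    """
    base_results = set(results_by_keyword[base])
    associated, non_associated = [], []
    for keyword, results in results_by_keyword.items():
        if keyword == base:
            continue
        other = set(results)
        union = base_results | other
        # Jaccard similarity of the two result sets (assumed measure).
        similarity = len(base_results & other) / len(union) if union else 0.0
        if similarity >= threshold:
            associated.append(keyword)
        else:
            non_associated.append(keyword)
    return associated, non_associated
```

Calling the function once per keyword, with that keyword as `base`, yields the per-keyword associated groups that step B draws on.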
In step B, the browsing paths belonging to the associated group of the keyword being processed are merged with the browsing paths of the keyword being processed. Several fusion nodes are set on each browsing path, and each fusion node is assigned a session permission for fusing with fusion nodes on other browsing paths. When data passes through a fusion node and satisfies that node's session permission, a temporary data mapping is established for the data at this fusion node and the data is mapped to the corresponding fusion node; the fusion node that receives the data saves the mapping path and the mapping result.
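The fusion (aggregation) node mechanism is described only in outline, so the following Python sketch is one interpretation rather than the patented implementation. It assumes that a session permission is a set of permitted session identifiers, that each fusion node has a single corresponding node on another path, and that the temporary mapping is a short-lived dictionary; the class `FusionNode` and all of its members are hypothetical names.

```python
class FusionNode:
    """A fusion node on a browsing path (illustrative interpretation)."""

    def __init__(self, name, allowed_sessions):
        self.name = name
        self.allowed_sessions = set(allowed_sessions)  # assumed form of "session permission"
        self.peer = None      # corresponding fusion node on another browsing path
        self.received = []    # (mapping path, mapping result) pairs saved on receipt

    def link(self, peer):
        """Pair this node with its corresponding node on another path."""
        self.peer = peer
        peer.peer = self

    def pass_through(self, session_id, data):
        """Map data to the peer node if the session permission is satisfied."""
        if self.peer is None or session_id not in self.allowed_sessions:
            return False
        # Temporary data mapping: exists only for the duration of this call.
        temporary_mapping = {"source": self.name, "target": self.peer.name, "data": data}
        self.peer.receive(temporary_mapping)
        return True

    def receive(self, mapping):
        """The receiving node saves the mapping path and the mapping result."""
        mapping_path = (mapping["source"], mapping["target"])
        self.received.append((mapping_path, mapping["data"]))
```

In this reading, only the receiving node keeps a durable record, which matches the statement that the node receiving the data saves the mapping path and mapping result.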
In step C, the entropy of each merged browsing path is calculated, and browsing paths whose entropy is higher than a second threshold are deleted.
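The patent does not say how the entropy of a browsing path is defined. As a hedged illustration, the Python sketch below assumes the Shannon entropy (in bits) of the distribution of page visits recorded along a merged path, so that paths whose visits are scattered evenly over many pages score high and are dropped. The names `path_entropy` and `drop_high_entropy` are hypothetical.

```python
import math
from collections import Counter


def path_entropy(page_visits):
    """Shannon entropy in bits of the page-visit distribution (assumed definition)."""
    counts = Counter(page_visits)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def drop_high_entropy(paths, second_threshold):
    """Keep only the merged paths whose entropy does not exceed the second threshold.

    paths: {path name: list of visited page identifiers}
    """
    return {
        name: visits
        for name, visits in paths.items()
        if path_entropy(visits) <= second_threshold
    }
```

Under this assumption, a path concentrated on a few pages is treated as informative, while a path spread uniformly across pages is treated as noise and removed before the hinge-node traversal.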
In step C, the hinge nodes in the retained browsing paths are traversed to obtain the characteristic function F(x, y) of each hinge node, where x denotes the links pointing to the hinge node from outside and y denotes the links pointing outward from the hinge node; the characteristic function F(x, y) is labeled with the browsing time and browsing hit rate obtained from the statistics.
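The form of F(x, y) is not given in the patent, so the Python sketch below only collects its two arguments and attaches the labels. It assumes that "outside" means outside the browsing path, that a hinge node is any page on the path with at least one such external link, and that statistics are available per page; the function `hinge_features` and its input layouts are hypothetical.

```python
def hinge_features(path_pages, links, stats):
    """Collect x, y and the statistical labels for each hinge node on a path.

    path_pages: ordered page identifiers on the retained browsing path
    links: list of (source, target) link pairs
    stats: {page: (browsing_time, hit_rate)} from the earlier statistics
    """
    inside = set(path_pages)
    features = {}
    for page in path_pages:
        # x: links pointing to this node from outside the path
        x = [src for src, dst in links if dst == page and src not in inside]
        # y: links pointing from this node to outside the path
        y = [dst for src, dst in links if src == page and dst not in inside]
        if x or y:  # assumed criterion for being a hinge node
            browsing_time, hit_rate = stats.get(page, (0.0, 0.0))
            features[page] = {
                "x": x,
                "y": y,
                "browsing_time": browsing_time,
                "hit_rate": hit_rate,
            }
    return features
```

Any concrete F could then be evaluated on the collected `x` and `y` values; the sketch deliberately stops short of choosing one.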
The browsing time and browsing hit rate obtained in steps B and C are normalized and then combined by weighted summation; the result is directly proportional to the keyword effect.
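A minimal Python sketch of this scoring step follows. The patent names neither the normalization method nor the weights, so min-max normalization across keywords and caller-supplied weights are assumptions, as are the names `min_max` and `keyword_scores`. Each keyword is given a list of metrics, for example browsing time and hit rate from step B followed by the same two values from step C.

```python
def min_max(values):
    """Min-max normalize a list of numbers to [0, 1] (assumed normalization)."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


def keyword_scores(metrics, weights):
    """Weighted sum of normalized metrics per keyword.

    metrics: {keyword: [m1, m2, ...]}, one value per weight
    weights: list of weights, one per metric (values are an assumption)
    """
    keywords = list(metrics)
    # Normalize each metric column across all keywords.
    columns = [
        min_max([metrics[k][i] for k in keywords]) for i in range(len(weights))
    ]
    return {
        k: sum(w * column[j] for w, column in zip(weights, columns))
        for j, k in enumerate(keywords)
    }
```

A higher score then indicates a more effective keyword, in line with the stated proportionality between the weighted sum and the keyword effect.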
When a fusion node and a hinge node coincide, x and y are replaced by the corresponding temporary data mapping to obtain a new characteristic function F′, and F′ is used to correct F, thereby improving the consistency between the forward analysis and the reverse analysis.
The foregoing description is presented only as a practicable technical solution of the present invention and not as the sole limitation on the technical solution itself.
Claims (6)
1. A keyword effect analysis method based on big data, characterized by comprising the following steps:
A. Record each keyword that occurs, together with the results retrieved using that keyword;
B. Group the different browsing paths that belong to the same keyword, merge the browsing paths whose similarity exceeds a first threshold, and compute the browsing time and browsing hit rate of the different browsing paths;
C. Group the different browsing paths that belong to the same retrieval result, merge the browsing paths whose similarity exceeds the first threshold, and compute the browsing time and browsing hit rate of the different browsing paths;
D. Analyze the statistics obtained in steps B and C to obtain the keyword effect analysis result.
2. The keyword effect analysis method based on big data according to claim 1, characterized in that: in step A, taking each keyword in turn as the reference, the remaining keywords are divided into an associated group and a non-associated group according to the similarity of their retrieval results.
3. The keyword effect analysis method based on big data according to claim 2, characterized in that: in step B, the browsing paths belonging to the associated group of the keyword being processed are merged with the browsing paths of the keyword being processed; several fusion nodes are set on each browsing path, and each fusion node is assigned a session permission for fusing with fusion nodes on other browsing paths; when data passes through a fusion node and satisfies that node's session permission, a temporary data mapping is established for the data at this fusion node and the data is mapped to the corresponding fusion node; the fusion node that receives the data saves the mapping path and the mapping result.
4. The keyword effect analysis method based on big data according to claim 1, characterized in that: in step C, the entropy of each merged browsing path is calculated, and browsing paths whose entropy is higher than a second threshold are deleted.
5. The keyword effect analysis method based on big data according to claim 4, characterized in that: in step C, the hinge nodes in the retained browsing paths are traversed to obtain the characteristic function F(x, y) of each hinge node, where x denotes the links pointing to the hinge node from outside and y denotes the links pointing outward from the hinge node; the characteristic function F(x, y) is labeled with the browsing time and browsing hit rate obtained from the statistics.
6. The keyword effect analysis method based on big data according to claim 1, characterized in that: the browsing time and browsing hit rate obtained in steps B and C are normalized and then combined by weighted summation; the result is directly proportional to the keyword effect.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710281439.XA CN108804431A (en) | 2017-04-26 | 2017-04-26 | A kind of keyword effect analysis method based on big data |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710281439.XA CN108804431A (en) | 2017-04-26 | 2017-04-26 | A kind of keyword effect analysis method based on big data |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108804431A true CN108804431A (en) | 2018-11-13 |
Family ID: 64068882
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710281439.XA Pending CN108804431A (en) | 2017-04-26 | 2017-04-26 | A kind of keyword effect analysis method based on big data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108804431A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101114287A (en) * | 2006-07-27 | 2008-01-30 | 国际商业机器公司 | Method and device for generating browsing paths for data and method for browsing data |
US20100161406A1 (en) * | 2008-12-23 | 2010-06-24 | Motorola, Inc. | Method and Apparatus for Managing Classes and Keywords and for Retrieving Advertisements |
CN103607496A (en) * | 2013-11-15 | 2014-02-26 | 中国科学院深圳先进技术研究院 | A method and an apparatus for deducting interests and hobbies of handset users and a handset terminal |
CN103744869A (en) * | 2013-12-18 | 2014-04-23 | 天脉聚源(北京)传媒科技有限公司 | Method, device and browser for displaying hotspot keyword |
Non-Patent Citations (1)
Title |
---|
彭朝晖 等: "S-CBR:基于数据库模式展现数据库关键词检索结果", 《软件学报》 * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110753064B (en) | Machine learning and rule matching fused security detection system | |
CN105930727B (en) | Reptile recognition methods based on Web | |
CN104050178B (en) | A kind of anti-cheat method of Internet surveillance and device | |
CN103927307B (en) | A kind of method and apparatus of identification website user | |
US9223968B2 (en) | Determining whether virtual network user is malicious user based on degree of association | |
CN101414939B (en) | Internet application recognition method based on dynamical depth package detection | |
CN103559235B (en) | A kind of online social networks malicious web pages detection recognition methods | |
EP2530874B1 (en) | Method and apparatus for detecting network attacks using a flow based technique | |
CN106453438B (en) | Network attack identification method and device | |
EP1918832A2 (en) | Session based web usage reporter | |
CN111143415B (en) | Data processing method, device and computer readable storage medium | |
CN107483488A (en) | A kind of malice Http detection methods and system | |
CN105281973A (en) | Webpage fingerprint identification method aiming at specific website category | |
CN103746982B (en) | A kind of http network condition code automatic generation method and its system | |
CN107302534A (en) | A kind of DDoS network attack detecting methods and device based on big data platform | |
CN110708339B (en) | Correlation analysis method based on WEB log | |
CN104348642B (en) | A kind of garbage information filtering method and device | |
CN107578263A (en) | A kind of detection method, device and the electronic equipment of advertisement abnormal access | |
CN109275045B (en) | DFI-based mobile terminal encrypted video advertisement traffic identification method | |
CN106878314A (en) | Network malicious act detection method based on confidence level | |
CN106802904A (en) | Log processing method, apparatus and system | |
CN108289125A (en) | TCP sessions recombination based on Stream Processing and statistical data extracting method | |
CN108055227B (en) | WAF unknown attack defense method based on site self-learning | |
CN109361575A (en) | A kind of method and its system obtaining analysis DNS data on flows | |
CN109981389A (en) | Phone number recognition methods, device, equipment and medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20181113 |