WO2015067154A1 - Data processing method and apparatus for web page click statistics - Google Patents

Data processing method and apparatus for web page click statistics

Info

Publication number
WO2015067154A1
WO2015067154A1 PCT/CN2014/090189 CN2014090189W WO2015067154A1 WO 2015067154 A1 WO2015067154 A1 WO 2015067154A1 CN 2014090189 W CN2014090189 W CN 2014090189W WO 2015067154 A1 WO2015067154 A1 WO 2015067154A1
Authority
WO
WIPO (PCT)
Prior art keywords
area
click
webpage
density
vector
Prior art date
Application number
PCT/CN2014/090189
Other languages
English (en)
French (fr)
Inventor
刘合翔
何鑫
Original Assignee
北京国双科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京国双科技有限公司
Priority to US15/033,953 (US10083251B2)
Publication of WO2015067154A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3438 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment monitoring of user actions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9024 Graphs; Linked lists
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F16/358 Browsing; Visualisation therefor
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/06 Generation of reports
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/02 Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/535 Tracking the activity of the user

Definitions

  • The present invention relates to the field of data processing, and in particular to a data processing method and apparatus for web page click statistics.
  • In the related art, click-hotspot techniques for web pages are mostly concerned with the visual display of hotspots and with associating information with hotspot areas.
  • The main implementation steps of web page click-hotspot techniques are: (1) constructing a coordinate system for the web page; (2) recording the coordinates of each click; and (3) associating the click coordinates with related information. However, these related-art schemes cannot compile per-area statistics on the clicks on a web page.
  • The main purpose of the present invention is to provide a data processing method and apparatus for web page click statistics that solve the problem that the related art cannot compile per-area statistics on the clicks on a web page.
  • A data processing method for web page click statistics includes: acquiring a coordinate system of the web page to be monitored; recording the click volume on the web page by means of the coordinate system; determining hotspot areas on the web page from the click volume; and counting the number of determined hotspot areas.
  • Determining hotspot areas on the web page from the click volume includes: acquiring an area whose click density is greater than a predetermined threshold; and using the acquired area as a hotspot area.
  • Acquiring an area whose click density is greater than the predetermined threshold includes: selecting an arbitrary point in the two-dimensional plane of the web page and drawing a circle with that point as the center and r as the radius, where every clicked point falling within the circle forms a vector with the center, each vector starting at the center and ending at a clicked point within the circle, so that there are one or more vectors; adding the one or more vectors to obtain a mean shift (Meanshift) vector; determining whether the modulus of the Meanshift vector is less than a preset extreme value; when the modulus is determined to be less than the preset extreme value, obtaining the click density and the independent continuous areas by means of the Meanshift vector; determining whether the click density is greater than the predetermined threshold; and acquiring, within the independent continuous areas, the areas whose click density is greater than the predetermined threshold.
  • The click density is obtained by acquiring the number of clicks in each of the independent continuous areas and dividing the number of clicks by the corresponding area to obtain the click density of that area.
  • The independent continuous areas are obtained by grouping and summarizing the circle-center coordinates within the web page and recording the area formed by the set of circle-center coordinates that point to the same end point as one independent continuous area.
  • A data processing apparatus for web page click statistics includes: an obtaining unit, configured to acquire a coordinate system of the web page to be monitored; a recording unit, configured to record the click volume on the web page by means of the coordinate system; a determining unit, configured to determine hotspot areas on the web page from the click volume; and a statistics unit, configured to count the number of determined hotspot areas.
  • The determining unit includes: an obtaining module, configured to acquire an area whose click density is greater than a predetermined threshold; and a determining module, configured to use the acquired area as a hotspot area.
  • The obtaining module includes: a vector processing submodule, configured to select an arbitrary point in the two-dimensional plane of the web page and draw a circle with that point as the center and r as the radius, where every clicked point falling within the circle forms a vector with the center, each vector starting at the center and ending at a clicked point within the circle, so that there are one or more vectors; a mean submodule, configured to add the vectors to obtain a mean shift (Meanshift) vector; a first determining submodule, configured to determine whether the modulus of the Meanshift vector is less than a preset extreme value; a first obtaining submodule, configured to obtain the click density and the independent continuous areas by means of the Meanshift vector when the modulus is determined to be less than the preset extreme value; a second determining submodule, configured to determine whether the click density is greater than the predetermined threshold; and a second obtaining submodule, configured to acquire, within the independent continuous areas, the areas whose click density is greater than the predetermined threshold.
  • The obtaining module is configured to acquire the number of clicks in each of the independent continuous areas and divide the number of clicks by the corresponding area to obtain the click density of that area.
  • The obtaining module is configured to group and summarize the circle-center coordinates within the web page and record the area formed by the set of circle-center coordinates that point to the same end point as one independent continuous area.
  • The invention acquires the coordinate system of the web page to be monitored, records the click volume on the web page by means of the coordinate system, determines hotspot areas on the web page from the click volume, and counts the number of determined hotspot areas. This solves the problem that the related art cannot compile per-area statistics on web page clicks and achieves automatic counting of the number of clicked hotspot areas on the web page.
  • FIG. 1 is a schematic diagram of a data processing apparatus for webpage page click statistics according to a first embodiment of the present invention
  • FIG. 2 is a schematic diagram of a data processing apparatus for webpage page click statistics according to a second embodiment of the present invention
  • FIG. 3 is a flowchart of a data processing method for webpage page click statistics according to a first embodiment of the present invention
  • FIG. 4 is a flow chart of a data processing method for webpage page click statistics according to a second embodiment of the present invention.
  • a data processing apparatus for webpage page click statistics is provided, and the apparatus is configured to count the amount of clicks of each pixel on a webpage page to obtain the number of clicked hotspot areas of the webpage page.
  • FIG. 1 is a schematic diagram of a data processing apparatus for webpage page click statistics according to a first embodiment of the present invention.
  • the apparatus includes an acquisition unit 10, a recording unit 20, a determination unit 30, and a statistics unit 40.
  • The obtaining unit 10 is configured to acquire a coordinate system of the web page to be monitored.
  • The web page may be a web page on any of multiple platforms and in any of multiple browsers, and the coordinate system may be an orthogonal rectangular coordinate system.
  • Acquiring the coordinate system of the monitored web page includes acquiring the coordinate origin of the orthogonal rectangular coordinate system, the horizontal axis (i.e., the X axis) and its positive direction, the vertical axis (i.e., the Y axis) and its positive direction, and the unit length.
  • The point at the upper left corner of the web page may be set as the coordinate origin, the horizontal rightward direction of the web page as the positive direction of the horizontal axis, and the direction along the vertical of the web page as the positive direction of the vertical axis.
  • The unit length may be 1 nm or 1 um, etc., and may be chosen according to the required coordinate accuracy. With the coordinate system acquired by the obtaining unit 10, the coordinates of any point in the monitored web page can be obtained.
  • The unit length corresponds to a unit area, and each unit area corresponds to a set of pixels. The chosen unit length determines the number of pixels in a unit area, so the unit area can serve as the basis for counting: the click volume is recorded as the number of clicked pixels per unit area.
  • The unit length may also be one pixel (abbreviated px).
  • The recording unit 20 is configured to record the click volume on the web page by means of the coordinate system. Note that the recording unit 20 may record the click volume on the web page within a preset time period, where the click volume is the click volume of the pixels in each different area of the web page rather than the overall click volume of the page.
  • The recording unit 20 may include one or more recording modules, and each recording module may include a counter.
  • The web page has an infinite number of points, and each point can be mapped to a recording module through its coordinates.
  • When a user browses the web page, points on the page are clicked. Each time the user clicks a point, the recording module corresponding to that point's coordinates is incremented by 1; when the user does not click the point, that recording module remains unchanged. In this way, the different recording modules record the click volume at different points of the web page within the preset time period.
  • the determining unit 30 is configured to determine a hot spot area on the webpage page by the click amount.
  • In general, a hotspot is news or information that attracts particular public attention or popularity, or a place or issue that draws attention during a certain period.
  • the hotspot area on the webpage page refers to an area of a webpage page with a relatively large number of clicks or a relatively large click density.
  • the hotspot area may be a webpage page area whose click amount exceeds a preset value.
  • The statistics unit 40 is configured to count the number of determined hotspot areas.
  • The statistics unit 40 may be a counter or a hash table.
  • The web page interface is divided into different areas. When the statistics unit 40 is a counter, the counter is incremented by 1 if an area on the web page interface is determined to be a hotspot area, and otherwise remains unchanged. When the statistics unit 40 is a hash table, the hotspot area may serve as the key of the hash table and the number of hotspot areas as its hash value. When an area on the web page interface is determined to be a hotspot area, it is checked whether that hotspot area is already a key of the hash table; if it is, the hash value remains unchanged.
  • If the hotspot area is not yet a key of the hash table, it is added to the hash table as a key and the hash value is incremented by 1.
  • Otherwise, if an area on the web page interface is determined not to be a hotspot area, the hash value of the hash table remains unchanged.
  • The obtaining unit 10 acquires the coordinates of each point in the web page coordinate system; the recording unit 20 records the click volume of each point within the preset time period; the determining unit 30 determines the hotspot areas of the web page according to the click volume; and the statistics unit 40 counts the number of hotspot areas on the web page.
  • This solves the problem that the related art lacks click statistics for the individual points of a web page, and achieves automatic counting of the number of clicked hotspot areas on the web page.
  • FIG. 2 is a schematic diagram of a data processing apparatus for webpage page click statistics according to a second embodiment of the present invention.
  • the data processing apparatus for webpage page click statistics of the embodiment includes the obtaining unit 10 and the recording unit 20 of the first embodiment.
  • the functions of the obtaining unit 10, the recording unit 20, and the statistic unit 40 are the same as those in the first embodiment, and are not described herein again.
  • The obtaining module 301 is configured to acquire an area whose click density is greater than a predetermined threshold.
  • Specifically, the obtaining module 301 may acquire such an area by comparing the click density of each point with the preset threshold: when the click density of the points in a certain area of the web page is greater than the preset threshold, that area is an acquired area whose click density is greater than the predetermined threshold.
  • An area whose click density is greater than the predetermined threshold is a hotspot area. A hotspot area is determined jointly by the click density of each point on the web page and the preset threshold: the click density of the points determines the location of the hotspot area, and the preset threshold determines its size.
  • The obtaining module 301 includes a vector processing submodule, a mean submodule, a first determining submodule, a first obtaining submodule, a second determining submodule, and a second obtaining submodule.
  • The vector processing submodule is configured to select a point in the two-dimensional plane of the web page and draw a circle with that point as the center and r as the radius. Every clicked point falling within the circle forms a vector with the center; each vector starts at the center of the circle and ends at a clicked point within the circle.
  • There may be one or more such vectors.
  • The mean submodule is configured to add the one or more vectors to obtain a mean shift (Meanshift) vector.
  • Mean shift is an effective iterative statistical algorithm, and the Meanshift vector is the vector obtained by that algorithm.
  • The first determining submodule is configured to determine whether the modulus of the Meanshift vector is less than a preset extreme value, which may be a sufficiently small number. When the first determining submodule determines that the modulus is not less than the preset extreme value, the vector processing submodule draws a new circle of radius r centered on the end point of the Meanshift vector. Every clicked point within this circle forms another vector with the center, starting at the end point of the Meanshift vector and ending at the clicked point; again there may be one or more such vectors.
  • The mean submodule then adds these vectors to obtain another Meanshift vector, and the process continues until the first determining submodule determines that the modulus of the Meanshift vector is less than the preset extreme value.
  • When the first determining submodule determines that the modulus of the Meanshift vector is less than the preset extreme value, the first obtaining submodule obtains the click density and the independent continuous areas by means of the Meanshift vector.
  • Specifically, the number of clicks in each of the independent continuous areas may be acquired and divided by the corresponding area to obtain the click density of that area; and the circle-center coordinates within the web page may be grouped and summarized, with the area formed by the set of circle-center coordinates that point to the same end point recorded as one independent continuous area.
  • The second determining submodule is configured to determine whether the click density is greater than the predetermined threshold.
  • The second obtaining submodule is configured to acquire, within the independent continuous areas, the areas whose click density is greater than the predetermined threshold.
  • The determining module 302 is configured to use the acquired area as a hotspot area.
  • The obtaining module 301 compares the click volume with the predetermined threshold to acquire the area whose click volume is greater than the predetermined threshold, the determining module 302 determines that area to be a hotspot area, and the first calculating unit 40 and the second calculating unit 50 calculate the hotspot density of the web page. This solves the problem that the related art lacks click statistics for the individual points of a web page, and achieves automatic calculation of the hotspot density of clicks on the web page.
  • A data processing method for web page click statistics is provided, which counts the click volume of each pixel on a web page to obtain the number of clicked hotspot areas of the web page.
  • The data processing method for web page click statistics can run on a computer processing device. Note that the method provided by the embodiment of the present invention may be performed by the data processing apparatus for web page click statistics of the embodiment of the present invention, and that apparatus may likewise be used to execute the method.
  • FIG. 3 is a flow chart of a data processing method for web page hit count statistics according to a first embodiment of the present invention.
  • the method includes the following steps S101 to S104:
  • Step S101: Acquire a coordinate system of the web page to be monitored.
  • The web page may be a web page on any of multiple platforms and in any of multiple browsers.
  • The coordinate system may be an orthogonal rectangular coordinate system.
  • Acquiring the coordinate system of the monitored web page includes acquiring the coordinate origin of the orthogonal rectangular coordinate system, the horizontal axis (i.e., the X axis) and its positive direction, the vertical axis (i.e., the Y axis) and its positive direction, and the unit length. The point at the upper left corner of the web page may be set as the coordinate origin, the horizontal rightward direction of the web page as the positive direction of the horizontal axis, and the direction along the vertical of the web page as the positive direction of the vertical axis.
  • The unit length corresponds to a unit area, and each unit area corresponds to a set of pixels. The chosen unit length determines the number of pixels in a unit area, so the unit area can serve as the basis for counting: the click volume is recorded as the number of clicked pixels per unit area.
  • The unit length may also be one pixel (abbreviated px).
  • Step S102: Record the click volume on the web page by means of the coordinate system.
  • The click volume on the web page within a preset time period can be recorded by means of the coordinate system, where the click volume is the click volume of the pixels in each different area of the web page rather than the overall click volume of the page.
  • The web page has an infinite number of points, and each point can be mapped to a counter through its coordinates. When a user browses the web page, points on the page are clicked. Each time the user clicks a point, the counter corresponding to that point's coordinates is incremented by 1; when the user does not click the point, that counter remains unchanged. In this way, the click volume at different points of the web page is recorded within the preset time period.
  • Step S103: Determine hotspot areas on the web page from the click volume.
  • In general, a hotspot is news or information that attracts particular public attention or popularity, or a place or issue that draws attention during a certain period.
  • A hotspot area on the web page is an area of the web page with a relatively large number of clicks or a relatively high click density.
  • The hotspot area may be an area of the web page whose click volume exceeds a preset value.
  • Step S104: Count the number of determined hotspot areas.
  • The number of hotspot areas on the web page may be counted by a counter or a hash table.
  • The web page interface is divided into different areas. When a counter is used to count the hotspot areas, the counter is incremented by 1 if an area on the web page interface is determined to be a hotspot area, and otherwise remains unchanged.
  • When a hash table is used, the hotspot area may serve as the key of the hash table and the number of hotspot areas as its hash value. When an area on the web page interface is a hotspot area, it is checked whether that hotspot area is already a key of the hash table. If it is, the hash value remains unchanged; if it is not, the hotspot area is added to the hash table as a key and the hash value is incremented by 1. Otherwise, if an area on the web page interface is not a hotspot area, the hash value of the hash table remains unchanged.
  • The coordinates of each point in the web page coordinate system are acquired; the click volume of each point within the preset time period is recorded; the hotspot areas of the web page are determined according to the click volume; and the number of hotspot areas on the web page is counted.
  • This solves the problem that the related art lacks click statistics for the individual points of a web page, and achieves automatic counting of the number of clicked hotspot areas on the web page.
  • FIG. 4 is a flow chart of a data processing method for webpage page click statistics according to a second embodiment of the present invention.
  • the data processing method for webpage page click statistics includes the following steps S201 to S205, which may be used as a preferred embodiment of the embodiment shown in FIG.
  • Step S201 and step S202 are respectively the same as step S101 and step S102 of the embodiment shown in FIG. 3, and details are not described herein again.
  • Step S203: Acquire an area whose click density is greater than a predetermined threshold.
  • Specifically, such an area may be acquired by comparing the click density of each point with the preset threshold. If the click density of the points in a certain area of the web page is greater than the preset threshold, that area is an acquired area whose click density is greater than the predetermined threshold, and such an area is a hotspot area.
  • A hotspot area is thus determined jointly by the click density of each point on the web page and the preset threshold: the click density of the points determines the location of the hotspot area, and the preset threshold determines its size. The larger the preset threshold, the smaller the hotspot area; the smaller the preset threshold, the larger the hotspot area.
  • Specifically, an area whose click density is greater than the predetermined threshold may be acquired by the following steps:
  • Step 1: Select a point in the two-dimensional plane of the web page and draw a circle with that point as the center and r as the radius. Every clicked point falling within the circle forms a vector with the center.
  • Each vector starts at the center of the circle and ends at a clicked point within the circle; there may be one or more such vectors.
  • Step 2: Add the one or more vectors to obtain a mean shift (Meanshift) vector. Mean shift is an effective iterative statistical algorithm, and the Meanshift vector is the vector obtained by that algorithm.
  • Step 3: Determine whether the modulus of the Meanshift vector is less than a preset extreme value, which may be a sufficiently small number. When the modulus is determined to be not less than the preset extreme value, steps 1 and 2 are performed again in sequence until the modulus of the Meanshift vector is determined to be less than the preset extreme value.
  • When step 1 is repeated, the end point of the Meanshift vector is taken as the center of a new circle of radius r. Every clicked point within this circle forms another vector with the center, starting at the end point of the Meanshift vector and ending at the clicked point; again there may be one or more such vectors.
  • Step 4: When the modulus of the Meanshift vector is determined to be less than the preset extreme value, obtain the click density and the independent continuous areas by means of the Meanshift vector.
  • Specifically, the number of clicks in each of the independent continuous areas may be acquired and divided by the corresponding area to obtain the click density of that area; and the circle-center coordinates within the web page may be grouped and summarized, with the area formed by the set of circle-center coordinates that point to the same end point recorded as one independent continuous area. It is then determined whether the click density is greater than the predetermined threshold, and the areas within the independent continuous areas whose click density is greater than the predetermined threshold are acquired.
  • step S204 the acquired area is taken as a hotspot area.
  • Step S205 is the same as step S104 of the embodiment shown in FIG. 3, and details are not described herein again.
  • an area in which the click amount is greater than a predetermined threshold is obtained by comparing the click amount with a predetermined threshold, and the area is determined as a hot spot area, and the number of hotspot areas on the web page is counted.
  • The modules or steps of the present invention described above can be implemented by general-purpose computing devices, and can be concentrated on a single computing device or distributed across multiple computing devices. Alternatively, they may be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; or they may each be fabricated as an individual integrated circuit module; or several of the modules or steps may be implemented as a single integrated circuit module. Thus, the invention is not limited to any specific combination of hardware and software.
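
The recording and counting behaviour described for the recording unit and the statistics unit can be illustrated with a short Python sketch. It is not taken from the patent: the names `ClickRecorder` and `count_hotspot_areas` and the use of string identifiers for areas are assumptions made for illustration only. One counter is kept per coordinate, and hotspot areas are counted with a hash table so that an area already present as a key does not raise the count again.

```python
from collections import Counter


class ClickRecorder:
    """Hypothetical recorder: one counter per (x, y) coordinate."""

    def __init__(self):
        self.counts = Counter()

    def record(self, x, y):
        # Each click on a point increments the counter for that point by 1.
        self.counts[(x, y)] += 1


def count_hotspot_areas(hotspot_area_ids):
    """Count distinct hotspot areas with a hash table keyed by area."""
    table = {}
    count = 0
    for area_id in hotspot_area_ids:
        if area_id not in table:
            # New key: add it to the table and increase the count by 1.
            table[area_id] = True
            count += 1
        # Key already present: the count stays unchanged.
    return count
```

Coordinates that were never clicked simply read as zero, so no counter needs to be allocated in advance for every point of the page.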

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Software Systems (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

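The present invention discloses a data processing method and apparatus for webpage click statistics. The data processing method includes: acquiring a coordinate system of a monitored webpage; recording click counts on the webpage by means of the coordinate system; determining hotspot areas on the webpage from the click counts; and counting the number of determined hotspot areas. The invention solves the problem in the related art that clicks on a webpage cannot be counted by area, and thereby achieves automatic counting of the number of clicked hotspot areas in a webpage.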

Description

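Data processing method and apparatus for webpage click statistics
Technical Field
The present invention relates to the field of data processing, and in particular to a data processing method and apparatus for webpage click statistics.
Background
At present, in the related art, click-hotspot techniques for webpages mostly concern the visual presentation of hotspots and the association of information with hotspot areas. They are mainly implemented as follows: (1) construct a coordinate system for the webpage; (2) record the coordinate positions of clicks; (3) associate the click coordinate positions with related information. However, these schemes cannot count the clicks on a webpage by area.
No effective solution has yet been proposed for the problem in the related art that clicks on a webpage cannot be counted by area.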
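Summary of the Invention
The main objective of the present invention is to provide a data processing method and apparatus for webpage click statistics, so as to solve the problem in the related art that clicks on a webpage cannot be counted by area.
To achieve the above objective, according to one aspect of the present invention, a data processing method for webpage click statistics is provided. The method includes: acquiring a coordinate system of a monitored webpage; recording click counts on the webpage by means of the coordinate system; determining hotspot areas on the webpage from the click counts; and counting the number of determined hotspot areas.
Further, determining hotspot areas on the webpage from the click counts includes: acquiring areas in which the click density is greater than a predetermined threshold; and taking the acquired areas as the hotspot areas.
Further, acquiring areas in which the click density is greater than a predetermined threshold includes: selecting an arbitrary point in the two-dimensional plane of the webpage and drawing a circle with that point as the center and r as the radius, where every clicked point falling within the circle forms a vector with the center, the vector starting at the center and ending at the clicked point, so that there are one or more vectors; adding the one or more vectors to obtain a mean shift (Meanshift) vector; judging whether the modulus of the Meanshift vector is smaller than a preset extreme value; when the modulus of the Meanshift vector is judged to be smaller than the preset extreme value, obtaining the click density and independent continuous areas by means of the Meanshift vector; judging whether the click density is greater than the predetermined threshold; and acquiring, among the independent continuous areas, the areas in which the click density is greater than the predetermined threshold.
Further, the click density is obtained as follows: obtaining the number of clicks in each area of the independent continuous areas; and dividing the number of clicks by the corresponding area to obtain the click density of that area.
Further, the independent continuous areas are obtained as follows: classifying and summarizing the circle-center coordinate positions within the webpage; and recording the area formed by the set of circle-center coordinates in the webpage that point to the same end point as an independent continuous area.
To achieve the above objective, according to another aspect of the present invention, a data processing apparatus for webpage click statistics is provided. The apparatus includes: an acquisition unit, configured to acquire a coordinate system of a monitored webpage; a recording unit, configured to record click counts on the webpage by means of the coordinate system; a determination unit, configured to determine hotspot areas on the webpage from the click counts; and a counting unit, configured to count the number of determined hotspot areas.
Further, the determination unit includes: an acquisition module, configured to acquire areas in which the click density is greater than a predetermined threshold; and a determination module, configured to take the acquired areas as the hotspot areas.
Further, the acquisition module includes: a vector processing submodule, configured to select an arbitrary point in the two-dimensional plane of the webpage and draw a circle with that point as the center and r as the radius, where every clicked point falling within the circle forms a vector with the center, the vector starting at the center and ending at the clicked point, so that there are one or more vectors; a mean submodule, configured to add the plurality of vectors to obtain a mean shift (Meanshift) vector; a first judgment submodule, configured to judge whether the modulus of the Meanshift vector is smaller than a preset extreme value; a first acquisition submodule, configured to obtain the click density and independent continuous areas by means of the Meanshift vector when the modulus of the Meanshift vector is judged to be smaller than the preset extreme value; a second judgment submodule, configured to judge whether the click density is greater than the predetermined threshold; and a second acquisition submodule, configured to acquire, among the independent continuous areas, the areas in which the click density is greater than the predetermined threshold.
Further, the acquisition module is configured to obtain the number of clicks in each area of the independent continuous areas and divide the number of clicks by the corresponding area to obtain the click density of that area.
Further, the acquisition module is configured to classify and summarize the circle-center coordinate positions within the webpage and record the area formed by the set of circle-center coordinates in the webpage that point to the same end point as an independent continuous area.
By acquiring a coordinate system of a monitored webpage, recording click counts on the webpage by means of the coordinate system, determining hotspot areas on the webpage from the click counts, and counting the number of determined hotspot areas, the present invention solves the problem in the related art that clicks on a webpage cannot be counted by area, and thereby achieves automatic counting of the number of clicked hotspot areas in a webpage.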
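Brief Description of the Drawings
The accompanying drawings, which form a part of this application, are provided for a further understanding of the present invention. The illustrative embodiments of the invention and their description serve to explain the invention and do not unduly limit it. In the drawings:
FIG. 1 is a schematic diagram of a data processing apparatus for webpage click statistics according to a first embodiment of the present invention;
FIG. 2 is a schematic diagram of a data processing apparatus for webpage click statistics according to a second embodiment of the present invention;
FIG. 3 is a flowchart of a data processing method for webpage click statistics according to the first embodiment of the present invention; and
FIG. 4 is a flowchart of a data processing method for webpage click statistics according to the second embodiment of the present invention.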
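Detailed Description
It should be noted that, where no conflict arises, the embodiments in this application and the features of those embodiments may be combined with one another. The present invention is described in detail below with reference to the drawings and in conjunction with the embodiments.
To help those skilled in the art better understand the solution of the present invention, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings of the embodiments. The described embodiments are evidently only some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort shall fall within the scope of protection of the present invention.
It should be noted that the terms "first", "second" and the like in the description, the claims and the above drawings are used to distinguish similar objects and do not necessarily describe a particular order or sequence. Data so described are interchangeable where appropriate, so that the embodiments described here can be carried out in orders other than those illustrated or described. In addition, the terms "include" and "have", and any variants of them, are intended to cover non-exclusive inclusion.
According to an embodiment of the present invention, a data processing apparatus for webpage click statistics is provided. The apparatus counts the clicks on individual pixels of a webpage in order to obtain the number of clicked hotspot areas of the webpage.
FIG. 1 is a schematic diagram of a data processing apparatus for webpage click statistics according to the first embodiment of the present invention.
As shown in FIG. 1, the apparatus includes an acquisition unit 10, a recording unit 20, a determination unit 30 and a counting unit 40.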
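The acquisition unit 10 is configured to acquire the coordinate system of the monitored webpage. In this embodiment, the webpage may be a webpage on any of various platforms and in any of various browsers, and the coordinate system may be an orthogonal rectangular coordinate system. Acquiring the coordinate system of the monitored webpage includes acquiring the origin of the orthogonal rectangular coordinate system, the horizontal axis (the X axis) and its positive direction, the vertical axis (the Y axis) and its positive direction, and the unit length. The point at the upper-left corner of the webpage may be set as the origin, the horizontal rightward direction along the webpage as the positive direction of the horizontal axis, and the vertical upward direction along the webpage as the positive direction of the vertical axis. The unit length may be, for example, 1 nm or 1 µm, and may be chosen according to the required coordinate precision. With the coordinate system acquired by the acquisition unit 10, the coordinates of any point in the monitored webpage can be obtained. It should be noted that, in this embodiment, the unit length corresponds to a unit area, each unit area corresponds to a set of pixels, and the chosen unit length determines the number of pixels in a unit area; the unit area can thus serve as the basis for counting, with the click count recorded as the number of clicked pixels within the unit area. In this embodiment, the unit length may also be 1 pixel (px), in which case each coordinate on the webpage corresponds to one pixel, and the coordinate can serve as the basis for counting, with the click count recorded as the number of clicked pixels corresponding to the coordinate.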
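The recording unit 20 is configured to record click counts on the webpage by means of the coordinate system. It should be noted that the recording unit 20 may record the click counts on the webpage within a preset time period, where the click counts are those of the pixels corresponding to different areas of the webpage rather than the overall click count of the webpage. In this embodiment, the recording unit 20 may include one or more recording modules, and a recording module may include a counter. A webpage consists of countless points, and each point can be mapped through its coordinates to a recording module. When a user browsing the webpage clicks a point in it, the recording module corresponding to that point's coordinates is incremented by 1; when the user clicks elsewhere, the recording module corresponding to that point's coordinates remains unchanged. In this way, within the preset time period, different recording modules record the click counts of different points on the webpage.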
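The per-coordinate counters described for the recording unit can be sketched as follows. This is a minimal illustration, not the patented implementation: the class name `ClickRecorder`, its methods, and the choice of bucketing coordinates by a configurable `unit_length` are all assumptions made for the example.

```python
from collections import Counter


class ClickRecorder:
    """Hypothetical recorder: one counter per unit cell of the page coordinate system."""

    def __init__(self, unit_length=1):
        # unit_length is an assumed parameter: the side of one counting cell,
        # expressed in the same units as the click coordinates (e.g. pixels).
        self.unit_length = unit_length
        self.counts = Counter()

    def _cell(self, x, y):
        # Map a click coordinate to the cell that contains it.
        return (int(x // self.unit_length), int(y // self.unit_length))

    def record(self, x, y):
        # A click increments only the counter of the cell it falls in;
        # every other counter is left unchanged.
        self.counts[self._cell(x, y)] += 1

    def clicks_at(self, x, y):
        # Counter returns 0 for cells that were never clicked.
        return self.counts[self._cell(x, y)]


recorder = ClickRecorder(unit_length=10)
for x, y in [(3, 4), (9, 9), (15, 4)]:
    recorder.record(x, y)
```

With `unit_length=1` each coordinate maps to its own counter, which corresponds to the one-pixel unit length mentioned in the text; larger values aggregate clicks per unit area.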
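The determination unit 30 is configured to determine hotspot areas on the webpage from the click counts. It should be noted that a hotspot generally refers to news or information that attracts wide public attention or popularity, or to a place or issue that draws attention during a certain period. In this embodiment, a hotspot area on a webpage is an area of the webpage with a relatively high number of clicks or a relatively high click density; specifically, a hotspot area may be a webpage area whose click count exceeds a preset value.
The counting unit 40 is configured to count the number of determined hotspot areas. In this embodiment, the counting unit 40 may be a counter or a hash table, and the webpage is divided into different areas. When the counting unit 40 is a counter, the counter is incremented by 1 if an area of the webpage is determined to be a hotspot area and remains unchanged otherwise. When the counting unit 40 is a hash table, a hotspot area may serve as a key of the hash table and the number of hotspot areas as its hash value. When an area of the webpage is determined to be a hotspot area, it is further judged whether that hotspot area is already a key of the hash table: if it is, the hash value remains unchanged; if it is not, the hotspot area is added to the hash table as a key and the hash value is incremented by 1. If an area of the webpage is determined not to be a hotspot area, the hash value remains unchanged.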
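The hash-table variant of the counting unit can be illustrated with a short sketch. It assumes that each area has some hashable identifier and that the hotspot decision has already been made elsewhere; the function name `count_hot_areas` and the `(area_id, is_hot)` input format are hypothetical.

```python
def count_hot_areas(determinations):
    """Count distinct hotspot areas from a stream of (area_id, is_hot) pairs.

    Sketch of the hash-table approach: a hotspot area becomes a key the
    first time it is seen, and only then is the running count incremented.
    """
    seen = {}   # hash table keyed by hotspot area identifier
    count = 0   # number of distinct hotspot areas so far
    for area_id, is_hot in determinations:
        if not is_hot:
            continue          # not a hotspot: count unchanged
        if area_id in seen:
            continue          # already a key: count unchanged
        seen[area_id] = True  # new key: register it and increment
        count += 1
    return count


example = [("banner", True), ("footer", False), ("banner", True), ("nav", True)]
hot_area_total = count_hot_areas(example)
```

The key lookup is what distinguishes this from the plain counter: an area that is reported as a hotspot more than once is still counted only once.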
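With the present invention, the acquisition unit 10 acquires the coordinates of each point in the webpage coordinate system; the recording unit 20 records the click count of each point within the preset time period; the determination unit 30 determines the hotspot areas of the webpage according to the click counts; and the counting unit 40 counts the number of hotspot areas on the webpage. This solves the problem in the related art of lacking click statistics for individual points of a webpage, and thereby achieves automatic counting of the number of clicked hotspot areas in a webpage.
FIG. 2 is a schematic diagram of a data processing apparatus for webpage click statistics according to the second embodiment of the present invention.
As shown in FIG. 2, this embodiment may serve as a preferred implementation of the embodiment shown in FIG. 1. The data processing apparatus of this embodiment includes the acquisition unit 10, recording unit 20, determination unit 30 and counting unit 40 of the first embodiment, where the determination unit 30 includes an acquisition module 301 and a determination module 302.
The acquisition unit 10, recording unit 20 and counting unit 40 function as in the first embodiment and are not described again here.
The acquisition module 301 is configured to acquire areas in which the click density is greater than a predetermined threshold. Specifically, the acquisition module 301 may compare the click density of each point with the preset threshold. When the click densities of the points within an area of the webpage are all greater than the preset threshold, that area is an acquired area in which the click density is greater than the predetermined threshold, that is, a hotspot area. Hotspot areas are determined by the click densities of the points on the webpage together with the preset threshold: the click densities determine where the hotspot areas are, and the preset threshold determines how large they are. A larger preset threshold yields smaller hotspot areas, and a smaller preset threshold yields larger ones.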
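In this embodiment, specifically, the acquisition module 301 includes a vector processing submodule, a mean submodule, a first judgment submodule, a first acquisition submodule, a second judgment submodule and a second acquisition submodule. The vector processing submodule is configured to select an arbitrary point in the two-dimensional plane of the webpage and draw a circle with that point as the center and r as the radius; every clicked point falling within the circle forms a vector with the center, starting at the center and ending at the clicked point, so there may be one or more such vectors. The mean submodule is configured to add the one or more vectors to obtain a mean shift (Meanshift) vector; mean shift is an effective iterative statistical algorithm, and the Meanshift vector is the vector it produces. The first judgment submodule is configured to judge whether the modulus of the Meanshift vector is smaller than a preset extreme value, which may be a sufficiently small number. When the first judgment submodule judges that the modulus is not smaller than the preset extreme value, the vector processing submodule further draws a circle of radius r centered on the end point of the Meanshift vector; every clicked point falling within this circle forms another vector with the center, starting at the end point of the Meanshift vector and ending at the clicked point, and again there may be one or more such vectors. The mean submodule then adds these vectors to obtain another Meanshift vector, and the process repeats until the first judgment submodule judges that the modulus of the Meanshift vector is smaller than the preset extreme value. At that point, the first acquisition submodule obtains the click density and the independent continuous areas by means of the Meanshift vector. Specifically, in this embodiment, the number of clicks in each area of the independent continuous areas may be obtained and divided by the corresponding area to give that area's click density, and the circle-center coordinate positions within the webpage may be classified and summarized, with the area formed by the set of circle-center coordinates that point to the same end point recorded as an independent continuous area. The second judgment submodule is configured to judge whether the click density is greater than the predetermined threshold. The second acquisition submodule is configured to acquire, among the independent continuous areas, the areas in which the click density is greater than the predetermined threshold.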
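The iteration performed by the vector processing, mean and first judgment submodules can be sketched as below. One assumption should be stated plainly: the text says the vectors are "added", whereas this sketch divides the sum by the number of points in the circle, as the standard flat-kernel mean shift does, so that the shifted center lands on the local centroid. The function names, the `epsilon` default (standing in for the preset extreme value) and the `max_iter` guard are hypothetical.

```python
import math


def mean_shift_vector(center, clicks, r):
    """Average of the vectors from `center` to every click within radius r."""
    cx, cy = center
    inside = [(x, y) for x, y in clicks if math.hypot(x - cx, y - cy) <= r]
    if not inside:
        return (0.0, 0.0)  # no clicks in the circle: nothing to shift towards
    sx = sum(x - cx for x, _ in inside)
    sy = sum(y - cy for _, y in inside)
    # Assumption: normalise the summed vector by the number of points.
    return (sx / len(inside), sy / len(inside))


def seek_mode(start, clicks, r, epsilon=1e-3, max_iter=100):
    """Shift the circle center until the Meanshift vector's modulus is below epsilon."""
    center = (float(start[0]), float(start[1]))
    for _ in range(max_iter):
        vx, vy = mean_shift_vector(center, clicks, r)
        if math.hypot(vx, vy) < epsilon:
            break  # modulus below the preset extreme value: converged
        # The end point of the Meanshift vector becomes the next circle center.
        center = (center[0] + vx, center[1] + vy)
    return center


clicks = [(10, 10), (11, 10), (10, 11), (11, 11)]
end_point = seek_mode((9, 9), clicks, r=5)
```

Running `seek_mode` from every candidate start point yields, for each start point, the end point it converges to; start points that share an end point are the raw material for the independent continuous areas.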
The determination module 302 is configured to take the acquired areas as the hotspot areas.
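Thus, in this embodiment, the acquisition module 301 compares the click count with the predetermined threshold to acquire the areas in which the click count is greater than the threshold, the determination module 302 determines those areas to be hotspot areas, and the counting unit 40 counts the number of hotspot areas on the webpage. This solves the problem in the related art of lacking click statistics for individual points of a webpage, and thereby achieves automatic counting of the number of clicked hotspot areas in a webpage.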
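According to an embodiment of the present invention, a data processing method for webpage click statistics is provided. The method counts the clicks on individual pixels of a webpage in order to obtain the number of clicked hotspot areas of the webpage, and may run on a computer processing device. It should be noted that the data processing method provided by the embodiments of the present invention may be performed by the data processing apparatus of the embodiments, and that the data processing apparatus of the embodiments may likewise be used to perform the data processing method of the embodiments.
FIG. 3 is a flowchart of a data processing method for webpage click statistics according to the first embodiment of the present invention.
As shown in FIG. 3, the method includes the following steps S101 to S104:
Step S101: acquire the coordinate system of the monitored webpage.
In this embodiment, the webpage may be a webpage on any of various platforms and in any of various browsers, and the coordinate system may be an orthogonal rectangular coordinate system. Acquiring the coordinate system of the monitored webpage includes acquiring the origin of the orthogonal rectangular coordinate system, the horizontal axis (the X axis) and its positive direction, the vertical axis (the Y axis) and its positive direction, and the unit length. The point at the upper-left corner of the webpage may be set as the origin, the horizontal rightward direction along the webpage as the positive direction of the horizontal axis, and the vertical upward direction along the webpage as the positive direction of the vertical axis. The unit length may be, for example, 1 nm or 1 µm, and may be chosen according to the required coordinate precision. With the acquired coordinate system, the coordinates of any point in the monitored webpage can be obtained. It should be noted that, in this embodiment, the unit length corresponds to a unit area, each unit area corresponds to a set of pixels, and the chosen unit length determines the number of pixels in a unit area; the unit area can thus serve as the basis for counting, with the click count recorded as the number of clicked pixels within the unit area. In this embodiment, the unit length may also be 1 pixel (px), in which case each coordinate on the webpage corresponds to one pixel, and the coordinate can serve as the basis for counting, with the click count recorded as the number of clicked pixels corresponding to the coordinate.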
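Step S102: record click counts on the webpage by means of the coordinate system.
It should be noted that the click counts on the webpage may be recorded within a preset time period, where the click counts are those of the pixels corresponding to different areas of the webpage rather than the overall click count of the webpage. A webpage consists of countless points, and each point can be mapped through its coordinates to a counter. When a user browsing the webpage clicks a point in it, the counter corresponding to that point's coordinates is incremented by 1; when the user clicks elsewhere, the counter corresponding to that point's coordinates remains unchanged. In this way, the click counts of different points on the webpage can be recorded within the preset time period.
Step S103: determine hotspot areas on the webpage from the click counts.
It should be noted that a hotspot generally refers to news or information that attracts wide public attention or popularity, or to a place or issue that draws attention during a certain period. In this embodiment, a hotspot area on a webpage is an area of the webpage with a relatively high number of clicks or a relatively high click density; specifically, a hotspot area may be a webpage area whose click count exceeds a preset value.
Step S104: count the number of determined hotspot areas.
In this embodiment, the number of hotspot areas on the webpage may be counted with a counter or a hash table, and the webpage is divided into different areas. When a counter is used, the counter is incremented by 1 if an area of the webpage is determined to be a hotspot area and remains unchanged otherwise. When a hash table is used, a hotspot area may serve as a key of the hash table and the number of hotspot areas as its hash value. When an area of the webpage is a hotspot area, it is judged whether that hotspot area is already a key of the hash table: if it is, the hash value remains unchanged; if it is not, the hotspot area is added to the hash table as a key and the hash value is incremented by 1. If an area of the webpage is not a hotspot area, the hash value remains unchanged.
By acquiring the coordinates of each point in the webpage coordinate system, recording the click count of each point within the preset time period, determining the hotspot areas of the webpage according to the click counts, and counting the number of hotspot areas on the webpage, the present invention solves the problem in the related art of lacking click statistics for individual points of a webpage, and thereby achieves automatic counting of the number of clicked hotspot areas in a webpage.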
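FIG. 4 is a flowchart of a data processing method for webpage click statistics according to the second embodiment of the present invention.
As shown in FIG. 4, the data processing method includes the following steps S201 to S205. This embodiment may serve as a preferred implementation of the embodiment shown in FIG. 3.
Steps S201 and S202 are the same as steps S101 and S102, respectively, of the embodiment shown in FIG. 3 and are not described again here.
Step S203: acquire areas in which the click density is greater than a predetermined threshold.
Specifically, in this embodiment, the click density of each point may be compared with the preset threshold. When the click densities of the points within an area of the webpage are all greater than the preset threshold, that area is an acquired area in which the click density is greater than the predetermined threshold, that is, a hotspot area. Hotspot areas are determined by the click densities of the points on the webpage together with the preset threshold: the click densities determine where the hotspot areas are, and the preset threshold determines how large they are. A larger preset threshold yields smaller hotspot areas, and a smaller preset threshold yields larger ones.
In this embodiment, specifically, the areas in which the click density is greater than the predetermined threshold may be acquired through the following steps: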
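Step 1: select an arbitrary point in the two-dimensional plane of the webpage and draw a circle with that point as the center and r as the radius. Every clicked point falling within the circle forms a vector with the center, starting at the center and ending at the clicked point; there may be one or more such vectors.
Step 2: add the one or more vectors to obtain a mean shift (Meanshift) vector. Mean shift is an effective iterative statistical algorithm, and the Meanshift vector is the vector it produces.
Step 3: judge whether the modulus of the Meanshift vector is smaller than a preset extreme value, which may be a sufficiently small number. When the modulus is judged not to be smaller than the preset extreme value, steps 1 and 2 are performed again in sequence until the modulus of the Meanshift vector is judged to be smaller than the preset extreme value. It should be noted that in this case step 1 draws a circle of radius r centered on the end point of the Meanshift vector; every clicked point falling within this circle forms another vector with the center, starting at the end point of the Meanshift vector and ending at the clicked point, and again there may be one or more such vectors.
Step 4: when the modulus of the Meanshift vector is judged to be smaller than the preset extreme value, obtain the click density and the independent continuous areas by means of the Meanshift vector. Specifically, in this embodiment, the number of clicks in each area of the independent continuous areas may be obtained and divided by the corresponding area to give that area's click density, and the circle-center coordinate positions within the webpage may be classified and summarized, with the area formed by the set of circle-center coordinates that point to the same end point recorded as an independent continuous area. It is then judged whether the click density is greater than the predetermined threshold, and the areas in which the click density is greater than the predetermined threshold are acquired from the independent continuous areas.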
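Step 4 can be illustrated with a small sketch that groups start points by the end point they converged to and then thresholds each group's click density. Several things here are assumptions rather than statements from the text: the size of an independent continuous area is taken to be the number of start cells in it multiplied by a per-cell area, end points are merged by rounding to `ndigits` decimal places, and the function name `hot_regions` and its argument layout are hypothetical.

```python
from collections import defaultdict


def hot_regions(endpoints, clicks_per_start, cell_area, threshold, ndigits=0):
    """Return {end point: density} for areas whose click density exceeds threshold.

    endpoints:        dict mapping a start cell to its converged end point
    clicks_per_start: dict mapping a start cell to its recorded click count
    cell_area:        assumed area covered by one start cell
    """
    # Classify start cells by (rounded) end point: each group is treated
    # as one independent continuous area.
    groups = defaultdict(list)
    for start, end in endpoints.items():
        key = (round(end[0], ndigits), round(end[1], ndigits))
        groups[key].append(start)

    hot = {}
    for key, starts in groups.items():
        clicks = sum(clicks_per_start.get(s, 0) for s in starts)
        area = len(starts) * cell_area   # assumed measure of the area's size
        density = clicks / area          # clicks divided by the area
        if density > threshold:
            hot[key] = density
    return hot


endpoints = {(0, 0): (1.0, 1.0), (1, 0): (1.02, 0.98), (9, 9): (9.0, 9.0)}
clicks_per_start = {(0, 0): 5, (1, 0): 3, (9, 9): 1}
result = hot_regions(endpoints, clicks_per_start, cell_area=1.0, threshold=2.0)
```

The number of hotspot areas counted in step S205 would then simply be `len(result)`. The rounding tolerance matters in practice, because iterations started from different points rarely converge to bit-identical end points.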
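Step S204: take the acquired areas as the hotspot areas.
Step S205 is the same as step S104 of the embodiment shown in FIG. 3 and is not described again here.
Thus, in this embodiment, by comparing the click count with the predetermined threshold to acquire the areas in which the click count is greater than the threshold, determining those areas to be hotspot areas, and counting the number of hotspot areas on the webpage, the problem in the related art of lacking click statistics for individual points of a webpage is solved, and automatic counting of the number of clicked hotspot areas in a webpage is achieved.
Evidently, those skilled in the art should understand that the modules or steps of the present invention described above can be implemented by general-purpose computing devices, either centralized on a single computing device or distributed across a network of multiple computing devices. Optionally, they may be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; or they may each be fabricated as an individual integrated circuit module, or several of the modules or steps may be fabricated as a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
The above are merely preferred embodiments of the present invention and are not intended to limit it; those skilled in the art may make various modifications and changes to the invention. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.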

Claims (10)

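  1. A data processing method for webpage click statistics, characterized by comprising:
    acquiring a coordinate system of a monitored webpage;
    recording click counts on the webpage by means of the coordinate system;
    determining hotspot areas on the webpage from the click counts; and
    counting the number of the determined hotspot areas.
  2. The data processing method according to claim 1, characterized in that determining hotspot areas on the webpage from the click counts comprises:
    acquiring areas in which the density of the click counts is greater than a predetermined threshold; and
    taking the acquired areas as the hotspot areas.
  3. The data processing method according to claim 2, characterized in that acquiring areas in which the density of the click counts is greater than a predetermined threshold comprises:
    selecting an arbitrary point in the two-dimensional plane of the webpage and drawing a circle with the point as the center and r as the radius, wherein every clicked point falling within the circle forms a vector with the center, the vector starting at the center and ending at the clicked point falling within the circle, and the vector comprising one or more vectors;
    adding the one or more vectors to obtain a mean shift (Meanshift) vector;
    judging whether the modulus of the Meanshift vector is smaller than a preset extreme value;
    when the modulus of the Meanshift vector is judged to be smaller than the preset extreme value, obtaining the density of the click counts and independent continuous areas by means of the Meanshift vector;
    judging whether the density of the click counts is greater than the predetermined threshold; and
    acquiring, among the independent continuous areas, the areas in which the density of the click counts is greater than the predetermined threshold.
  4. The data processing method according to claim 3, characterized in that the density of the click counts is obtained by:
    obtaining the number of clicks in each area of the independent continuous areas; and
    dividing the number of clicks by the corresponding area to obtain the click density of the corresponding area.
  5. The data processing method according to claim 3, characterized in that the independent continuous areas are obtained by:
    classifying and summarizing the circle-center coordinate positions within the webpage; and
    recording the area formed by the set of circle-center coordinates in the webpage that point to the same end point as the independent continuous area.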
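  6. A data processing apparatus for webpage click statistics, characterized by comprising:
    an acquisition unit, configured to acquire a coordinate system of a monitored webpage;
    a recording unit, configured to record click counts on the webpage by means of the coordinate system;
    a determination unit, configured to determine hotspot areas on the webpage from the click counts; and
    a counting unit, configured to count the number of the determined hotspot areas.
  7. The data processing apparatus according to claim 6, characterized in that the determination unit comprises:
    an acquisition module, configured to acquire areas in which the density of the click counts is greater than a predetermined threshold; and
    a determination module, configured to take the acquired areas as the hotspot areas.
  8. The data processing apparatus according to claim 7, characterized in that the acquisition module comprises:
    a vector processing submodule, configured to select an arbitrary point in the two-dimensional plane of the webpage and draw a circle with the point as the center and r as the radius, wherein every clicked point falling within the circle forms a vector with the center, the vector starting at the center and ending at the clicked point falling within the circle, and the vector comprising one or more vectors;
    a mean submodule, configured to add the plurality of vectors to obtain a mean shift (Meanshift) vector;
    a first judgment submodule, configured to judge whether the modulus of the Meanshift vector is smaller than a preset extreme value;
    a first acquisition submodule, configured to obtain the density of the click counts and independent continuous areas by means of the Meanshift vector when the modulus of the Meanshift vector is judged to be smaller than the preset extreme value;
    a second judgment submodule, configured to judge whether the density of the click counts is greater than the predetermined threshold; and
    a second acquisition submodule, configured to acquire, among the independent continuous areas, the areas in which the density of the click counts is greater than the predetermined threshold.
  9. The data processing apparatus according to claim 8, characterized in that the acquisition module is configured to obtain the number of clicks in each area of the independent continuous areas and divide the number of clicks by the corresponding area to obtain the click density of the corresponding area.
  10. The data processing apparatus according to claim 8, characterized in that the acquisition module is configured to classify and summarize the circle-center coordinate positions within the webpage and record the area formed by the set of circle-center coordinates in the webpage that point to the same end point as the independent continuous area.
PCT/CN2014/090189 2013-11-06 2014-11-03 Data processing method and apparatus for webpage click statistics WO2015067154A1 (zh)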
PCT/CN2014/090189 2013-11-06 2014-11-03 用于网页页面点击量统计的数据处理方法和装置 WO2015067154A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US15/033,953 US10083251B2 (en) 2013-11-06 2014-11-03 Data processing method and apparatus for counting webpage hits

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201310547813.8A CN103530431B (zh) 2013-11-06 2013-11-06 Data processing method and apparatus for webpage click statistics
CN201310547813.8 2013-11-06

Publications (1)

Publication Number Publication Date
WO2015067154A1 true WO2015067154A1 (zh) 2015-05-14

Family

ID=49932440

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2014/090189 WO2015067154A1 (zh) 2013-11-06 2014-11-03 Data processing method and apparatus for webpage click statistics

Country Status (3)

Country Link
US (1) US10083251B2 (zh)
CN (1) CN103530431B (zh)
WO (1) WO2015067154A1 (zh)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
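CN103530431B (zh) * 2013-11-06 2016-08-17 北京国双科技有限公司 Data processing method and apparatus for webpage click statistics
CN104881408A (zh) * 2014-02-27 2015-09-02 腾讯科技(深圳)有限公司 Method, apparatus and system for counting page clicks and displaying the results
CN106484700B (zh) * 2015-08-25 2019-09-20 北京国双科技有限公司 Method and apparatus for displaying page access data
CN106250404A (zh) * 2016-07-21 2016-12-21 柳州龙辉科技有限公司 Method for user operation analysis
CN108073597A (zh) * 2016-11-10 2018-05-25 北京国双科技有限公司 Method, apparatus and system for displaying page click behavior
CN110020347A (zh) * 2017-09-13 2019-07-16 北京国双科技有限公司 Method and apparatus for automatically judging the value of webpage areas
CN110020351B (zh) * 2017-09-29 2021-08-13 北京国双科技有限公司 Method and apparatus for detecting anomalies in click heat maps
CN109962983B (zh) * 2019-03-29 2021-11-23 北京搜狗科技发展有限公司 Click-through rate statistics method and apparatus
CN116821554B (zh) * 2023-08-30 2023-11-14 中航金网(北京)电子商务有限公司 Page analysis method and apparatus, electronic device, and medium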

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101777080A (zh) * 2010-03-19 2010-07-14 北京国双科技有限公司 Webpage analysis method based on user click data
CN102830922A (zh) * 2012-08-07 2012-12-19 晶赞广告(上海)有限公司 Method for visualizing hotspot data of advertisement click effectiveness
US20130166394A1 (en) * 2011-12-22 2013-06-27 Yahoo! Inc. Saliency-based evaluation of webpage designs and layouts
CN103530431A (zh) * 2013-11-06 2014-01-22 北京国双科技有限公司 Data processing method and apparatus for webpage click statistics

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6907566B1 (en) * 1999-04-02 2005-06-14 Overture Services, Inc. Method and system for optimum placement of advertisements on a webpage
US20090222454A1 (en) * 2005-12-21 2009-09-03 International Business Machines Corporation Method and data processing system for restructuring web content
CN101299688B (zh) * 2008-06-13 2010-12-22 北京缔元信互联网数据技术有限公司 Method for obtaining the number of clicks on webpage areas
US20120010995A1 (en) * 2008-10-23 2012-01-12 Savnor Technologies Web content capturing, packaging, distribution
US8234370B2 (en) * 2009-06-30 2012-07-31 International Business Machines Corporation Determining web analytics information

Also Published As

Publication number Publication date
US10083251B2 (en) 2018-09-25
US20160283609A1 (en) 2016-09-29
CN103530431A (zh) 2014-01-22
CN103530431B (zh) 2016-08-17

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14860615

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 15033953

Country of ref document: US

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 12.09.2016)

122 Ep: pct application non-entry in european phase

Ref document number: 14860615

Country of ref document: EP

Kind code of ref document: A1